All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/8] Add hardware prefetch control driver for A64FX and x86
@ 2022-05-18  6:30 ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

This patch series add sysfs interface to control CPU's hardware
prefetch behavior for performance tuning from userspace for the
processor A64FX and x86 (on supported CPU).

Changes from v3: https://lore.kernel.org/lkml/20220420030223.689259-1-tarumizu.kohei@fujitsu.com/
  - remove hardware-dependent code from core driver
    (driver/base/pfctl.c)
  - simplifies implementation of register bit operations
  - extract the pseudo_lock patch as a separate patch
    https://lore.kernel.org/lkml/20220518045517.2066518-1-tarumizu.kohei@fujitsu.com/

[Background]
============
A64FX and some Intel processors have implementation-dependent register
for controlling CPU's hardware prefetch behavior. A64FX has
IMP_PF_STREAM_DETECT_CTRL_EL0[1], and Intel processors have MSR 0x1a4
(MSR_MISC_FEATURE_CONTROL)[2]. These registers cannot be accessed from
userspace.

[1]https://github.com/fujitsu/A64FX/tree/master/doc/
   A64FX_Specification_HPC_Extension_v1_EN.pdf

[2]https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
    Volume 4

The advantage of using this is improved performance. As an example of
performance improvements, the results of running the Stream benchmark
on the A64FX are described in section [Merit].

For MSR 0x1a4, it is also possible to change the value from userspace
via the MSR driver. However, using MSR driver is not recommended, so
it needs a proper kernel interface[3].

[3]https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/about/

For these reasons, we provide a new proper kernel interface to control
both IMP_PF_STREAM_DETECT_CTRL_EL0 and MSR 0x1a4.

[Overall design]
================
The source code for this driver is divided into common parts
(driver/base/pfctl.c) and hardware dependent parts
(arch/x86/kernel/cpu/x86-pfctl.c and drivers/soc/fujitsu/a64fx-pfctl.c).
Architecture parts is described hardware-dependent processing. It must
create attributes for a specific cache level/type.
Common part is described hardware-independent processing. It create
sysfs using the attributes from architecture part.

This driver creates "prefetch_control" directory and some attributes
in every CPU's cache/indexX directory, if CPU supports hardware
prefetch control behavior.

Detailed description of this sysfs interface is in
Documentation/ABI/testing/sysfs-devices-system-cpu (patch8).

This driver needs cache sysfs directory and cache level/type
information. In ARM processor, these information can be obtained
from registers even without ACPI PPTT.
We add processing to create a cache/index directory using only the
information from the register if the machine does not support ACPI
PPTT and Kconfig for hardware prefetch control (CONFIG_HWPF_CONTROL)
is true in patch5.
This action caused a problem and is described in [Known problem].

[Examples]
==========
This section provides an example of using this sysfs interface at the
x86's model of INTEL_FAM6_BROADWELL_X.

This model has the following register specifications:

[0]    L2 Hardware Prefetcher Disable (R/W)
[1]    L2 Adjacent Cache Line Prefetcher Disable (R/W)
[2]    DCU Hardware Prefetcher Disable (R/W)
[3]    DCU IP Prefetcher Disable (R/W)
[63:4] Reserved

In this case, index0 (L1d cache) corresponds to bit[2,3] and index2
(L2 cache) corresponds to bit [0,1]. A list of attribute files of
index0 and index2 in CPU1 at BROADWELL_X is following:

```
# ls /sys/devices/system/cpu/cpu1/cache/index0/prefetch_control/

hardware_prefetcher_enable
ip_prefetcher_enable

# ls /sys/devices/system/cpu/cpu1/cache/index2/prefetch_control/

adjacent_cache_line_prefetcher_enable
hardware_prefetcher_enable
```

If user would like to disable the setting of "L2 Adjacent Cache Line
Prefetcher Disable (R/W)" in CPU1, do the following:

```
# echo 0 >
# /sys/devices/system/cpu/cpu1/cache/index2/prefetch_control/adjacent_cache_line_prefetcher_enable
```

In another example, a list of index0 at A64FX is following:

```
# ls /sys/devices/system/cpu/cpu1/cache/index0/prefetch_control/

stream_detect_prefetcher_dist
stream_detect_prefetcher_enable
stream_detect_prefetcher_strength
stream_detect_prefetcher_strength_available
```

[Patch organizations]
=====================
This patch series add hardware prefetch control core driver for A64FX
and x86. Also, we add support for A64FX and BROADWELL_X at x86.

- patch1: Add hardware prefetch core driver

  Adds a register/unregister function to provide sysfs interface to
  control CPU's hardware prefetch behavior. It creates the
  "prefetch_control" sysfs directory and some attributes.

- patch2: Add Kconfig/Makefile to build hardware prefetch control core
  driver

- patch3: Add support for A64FX

  Adds module init/exit code to create sysfs attributes for A64FX with
  "stream_detect_prefetcher_enable", "stream_detect_prefetcher_strong"
  and "stream_detect_prefetcher_dist".

- patch4: Add Kconfig/Makefile to build driver for A64FX

- patch5: Create cache sysfs directory without ACPI PPTT for hardware
  prefetch control

  Hardware Prefetch control driver needs cache sysfs directory and cache
  level/type information. In ARM processor, these information can be
  obtained from register(CLIDR_EL1) even without PPTT. Therefore, we
  set the cpu_map_populated to true to create cache sysfs directory, if
  the machine doesn't have PPTT.

- patch6: Add support for x86

  Adds module init/exit code to create sysfs attributes for x86 with
  "hardware_prefetcher_enable", "ip_prefetcher_enable" and
  "adjacent_cache_line_prefetcher_enable".

- patch7: Add Kconfig/Makefile to build driver for x86

- patch8: Add documentation for the new sysfs interface

[Known problem]
===============
- `lscpu` command terminates with -ENOENT because cache/index directory
  is exists but shared_cpu_map file does not exist. This is due to
  patch5, which creates a cache/index directory containing only level
  and type without ACPI PPTT.

[Merit]
=======
For reference, here is the result of STREAM Triad when tuning with
the "s file in L1 and L2 cache on A64FX.

| dist combination  | Pattern A   | Pattern B   |
|-------------------|-------------|-------------|
| L1:256,  L2:1024  | 234505.2144 | 114600.0801 |
| L1:1536, L2:1024  | 279172.8742 | 118979.4542 |
| L1:256,  L2:10240 | 247716.7757 | 127364.1533 |
| L1:1536, L2:10240 | 283675.6625 | 125950.6847 |

In pattern A, we set the size of the array to 174720, which is about
half the size of the L1d cache. In pattern B, we set the size of the
array to 10485120, which is about twice the size of the L2 cache.

In pattern A, a change of dist at L1 has a larger effect. On the other
hand, in pattern B, the change of dist at L2 has a larger effect.
As described above, the optimal dist combination depends on the
characteristics of the application. Therefore, such a sysfs interface
is useful for performance tuning.

Best regards,
Kohei Tarumizu

Kohei Tarumizu (8):
  drivers: base: Add hardware prefetch control core driver
  drivers: base: Add Kconfig/Makefile to build hardware prefetch control
    core driver
  soc: fujitsu: Add hardware prefetch control support for A64FX
  soc: fujitsu: Add Kconfig/Makefile to build hardware prefetch control
    driver
  arm64: Create cache sysfs directory without ACPI PPTT for hardware
    prefetch control
  x86: Add hardware prefetch control support for x86
  x86: Add Kconfig/Makefile to build hardware prefetch control driver
  docs: ABI: Add sysfs documentation interface of hardware prefetch
    control driver

 .../ABI/testing/sysfs-devices-system-cpu      |  98 +++++
 MAINTAINERS                                   |   8 +
 arch/arm64/kernel/cacheinfo.c                 |  29 ++
 arch/x86/Kconfig                              |   6 +
 arch/x86/kernel/cpu/Makefile                  |   2 +
 arch/x86/kernel/cpu/x86-pfctl.c               | 258 ++++++++++++
 drivers/base/Kconfig                          |   9 +
 drivers/base/Makefile                         |   1 +
 drivers/base/pfctl.c                          | 180 +++++++++
 drivers/soc/Kconfig                           |   1 +
 drivers/soc/Makefile                          |   1 +
 drivers/soc/fujitsu/Kconfig                   |  11 +
 drivers/soc/fujitsu/Makefile                  |   2 +
 drivers/soc/fujitsu/a64fx-pfctl.c             | 373 ++++++++++++++++++
 include/linux/pfctl.h                         |  14 +
 15 files changed, 993 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/x86-pfctl.c
 create mode 100644 drivers/base/pfctl.c
 create mode 100644 drivers/soc/fujitsu/Kconfig
 create mode 100644 drivers/soc/fujitsu/Makefile
 create mode 100644 drivers/soc/fujitsu/a64fx-pfctl.c
 create mode 100644 include/linux/pfctl.h

-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v4 0/8] Add hardware prefetch control driver for A64FX and x86
@ 2022-05-18  6:30 ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

This patch series add sysfs interface to control CPU's hardware
prefetch behavior for performance tuning from userspace for the
processor A64FX and x86 (on supported CPU).

Changes from v3: https://lore.kernel.org/lkml/20220420030223.689259-1-tarumizu.kohei@fujitsu.com/
  - remove hardware-dependent code from core driver
    (driver/base/pfctl.c)
  - simplifies implementation of register bit operations
  - extract the pseudo_lock patch as a separate patch
    https://lore.kernel.org/lkml/20220518045517.2066518-1-tarumizu.kohei@fujitsu.com/

[Background]
============
A64FX and some Intel processors have implementation-dependent register
for controlling CPU's hardware prefetch behavior. A64FX has
IMP_PF_STREAM_DETECT_CTRL_EL0[1], and Intel processors have MSR 0x1a4
(MSR_MISC_FEATURE_CONTROL)[2]. These registers cannot be accessed from
userspace.

[1]https://github.com/fujitsu/A64FX/tree/master/doc/
   A64FX_Specification_HPC_Extension_v1_EN.pdf

[2]https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
    Volume 4

The advantage of using this is improved performance. As an example of
performance improvements, the results of running the Stream benchmark
on the A64FX are described in section [Merit].

For MSR 0x1a4, it is also possible to change the value from userspace
via the MSR driver. However, using MSR driver is not recommended, so
it needs a proper kernel interface[3].

[3]https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/about/

For these reasons, we provide a new proper kernel interface to control
both IMP_PF_STREAM_DETECT_CTRL_EL0 and MSR 0x1a4.

[Overall design]
================
The source code for this driver is divided into common parts
(driver/base/pfctl.c) and hardware dependent parts
(arch/x86/kernel/cpu/x86-pfctl.c and drivers/soc/fujitsu/a64fx-pfctl.c).
Architecture parts is described hardware-dependent processing. It must
create attributes for a specific cache level/type.
Common part is described hardware-independent processing. It create
sysfs using the attributes from architecture part.

This driver creates "prefetch_control" directory and some attributes
in every CPU's cache/indexX directory, if CPU supports hardware
prefetch control behavior.

Detailed description of this sysfs interface is in
Documentation/ABI/testing/sysfs-devices-system-cpu (patch8).

This driver needs cache sysfs directory and cache level/type
information. In ARM processor, these information can be obtained
from registers even without ACPI PPTT.
We add processing to create a cache/index directory using only the
information from the register if the machine does not support ACPI
PPTT and Kconfig for hardware prefetch control (CONFIG_HWPF_CONTROL)
is true in patch5.
This action caused a problem and is described in [Known problem].

[Examples]
==========
This section provides an example of using this sysfs interface at the
x86's model of INTEL_FAM6_BROADWELL_X.

This model has the following register specifications:

[0]    L2 Hardware Prefetcher Disable (R/W)
[1]    L2 Adjacent Cache Line Prefetcher Disable (R/W)
[2]    DCU Hardware Prefetcher Disable (R/W)
[3]    DCU IP Prefetcher Disable (R/W)
[63:4] Reserved

In this case, index0 (L1d cache) corresponds to bit[2,3] and index2
(L2 cache) corresponds to bit [0,1]. A list of attribute files of
index0 and index2 in CPU1 at BROADWELL_X is following:

```
# ls /sys/devices/system/cpu/cpu1/cache/index0/prefetch_control/

hardware_prefetcher_enable
ip_prefetcher_enable

# ls /sys/devices/system/cpu/cpu1/cache/index2/prefetch_control/

adjacent_cache_line_prefetcher_enable
hardware_prefetcher_enable
```

If user would like to disable the setting of "L2 Adjacent Cache Line
Prefetcher Disable (R/W)" in CPU1, do the following:

```
# echo 0 >
# /sys/devices/system/cpu/cpu1/cache/index2/prefetch_control/adjacent_cache_line_prefetcher_enable
```

In another example, a list of index0 at A64FX is following:

```
# ls /sys/devices/system/cpu/cpu1/cache/index0/prefetch_control/

stream_detect_prefetcher_dist
stream_detect_prefetcher_enable
stream_detect_prefetcher_strength
stream_detect_prefetcher_strength_available
```

[Patch organizations]
=====================
This patch series add hardware prefetch control core driver for A64FX
and x86. Also, we add support for A64FX and BROADWELL_X at x86.

- patch1: Add hardware prefetch core driver

  Adds a register/unregister function to provide sysfs interface to
  control CPU's hardware prefetch behavior. It creates the
  "prefetch_control" sysfs directory and some attributes.

- patch2: Add Kconfig/Makefile to build hardware prefetch control core
  driver

- patch3: Add support for A64FX

  Adds module init/exit code to create sysfs attributes for A64FX with
  "stream_detect_prefetcher_enable", "stream_detect_prefetcher_strong"
  and "stream_detect_prefetcher_dist".

- patch4: Add Kconfig/Makefile to build driver for A64FX

- patch5: Create cache sysfs directory without ACPI PPTT for hardware
  prefetch control

  Hardware Prefetch control driver needs cache sysfs directory and cache
  level/type information. In ARM processor, these information can be
  obtained from register(CLIDR_EL1) even without PPTT. Therefore, we
  set the cpu_map_populated to true to create cache sysfs directory, if
  the machine doesn't have PPTT.

- patch6: Add support for x86

  Adds module init/exit code to create sysfs attributes for x86 with
  "hardware_prefetcher_enable", "ip_prefetcher_enable" and
  "adjacent_cache_line_prefetcher_enable".

- patch7: Add Kconfig/Makefile to build driver for x86

- patch8: Add documentation for the new sysfs interface

[Known problem]
===============
- `lscpu` command terminates with -ENOENT because cache/index directory
  is exists but shared_cpu_map file does not exist. This is due to
  patch5, which creates a cache/index directory containing only level
  and type without ACPI PPTT.

[Merit]
=======
For reference, here is the result of STREAM Triad when tuning with
the "s file in L1 and L2 cache on A64FX.

| dist combination  | Pattern A   | Pattern B   |
|-------------------|-------------|-------------|
| L1:256,  L2:1024  | 234505.2144 | 114600.0801 |
| L1:1536, L2:1024  | 279172.8742 | 118979.4542 |
| L1:256,  L2:10240 | 247716.7757 | 127364.1533 |
| L1:1536, L2:10240 | 283675.6625 | 125950.6847 |

In pattern A, we set the size of the array to 174720, which is about
half the size of the L1d cache. In pattern B, we set the size of the
array to 10485120, which is about twice the size of the L2 cache.

In pattern A, a change of dist at L1 has a larger effect. On the other
hand, in pattern B, the change of dist at L2 has a larger effect.
As described above, the optimal dist combination depends on the
characteristics of the application. Therefore, such a sysfs interface
is useful for performance tuning.

Best regards,
Kohei Tarumizu

Kohei Tarumizu (8):
  drivers: base: Add hardware prefetch control core driver
  drivers: base: Add Kconfig/Makefile to build hardware prefetch control
    core driver
  soc: fujitsu: Add hardware prefetch control support for A64FX
  soc: fujitsu: Add Kconfig/Makefile to build hardware prefetch control
    driver
  arm64: Create cache sysfs directory without ACPI PPTT for hardware
    prefetch control
  x86: Add hardware prefetch control support for x86
  x86: Add Kconfig/Makefile to build hardware prefetch control driver
  docs: ABI: Add sysfs documentation interface of hardware prefetch
    control driver

 .../ABI/testing/sysfs-devices-system-cpu      |  98 +++++
 MAINTAINERS                                   |   8 +
 arch/arm64/kernel/cacheinfo.c                 |  29 ++
 arch/x86/Kconfig                              |   6 +
 arch/x86/kernel/cpu/Makefile                  |   2 +
 arch/x86/kernel/cpu/x86-pfctl.c               | 258 ++++++++++++
 drivers/base/Kconfig                          |   9 +
 drivers/base/Makefile                         |   1 +
 drivers/base/pfctl.c                          | 180 +++++++++
 drivers/soc/Kconfig                           |   1 +
 drivers/soc/Makefile                          |   1 +
 drivers/soc/fujitsu/Kconfig                   |  11 +
 drivers/soc/fujitsu/Makefile                  |   2 +
 drivers/soc/fujitsu/a64fx-pfctl.c             | 373 ++++++++++++++++++
 include/linux/pfctl.h                         |  14 +
 15 files changed, 993 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/x86-pfctl.c
 create mode 100644 drivers/base/pfctl.c
 create mode 100644 drivers/soc/fujitsu/Kconfig
 create mode 100644 drivers/soc/fujitsu/Makefile
 create mode 100644 drivers/soc/fujitsu/a64fx-pfctl.c
 create mode 100644 include/linux/pfctl.h

-- 
2.27.0


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v4 1/8] drivers: base: Add hardware prefetch control core driver
  2022-05-18  6:30 ` Kohei Tarumizu
@ 2022-05-18  6:30   ` Kohei Tarumizu
  -1 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds a register/unregister function to provide sysfs interface to
control CPU's hardware prefetch behavior. It creates the
"prefetch_control" sysfs directory and some attributes.

Attributes are hardware dependent, so it must be implemented for each
hardware. If CPU has a hardware prefetch behavior, call this function
to create sysfs.

Following patches add support for A64FX and x86.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 drivers/base/pfctl.c  | 180 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pfctl.h |  14 ++++
 2 files changed, 194 insertions(+)
 create mode 100644 drivers/base/pfctl.c
 create mode 100644 include/linux/pfctl.h

diff --git a/drivers/base/pfctl.c b/drivers/base/pfctl.c
new file mode 100644
index 000000000000..08ee8faaf277
--- /dev/null
+++ b/drivers/base/pfctl.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hardware prefetch control support via sysfs.
+ *
+ * Copyright 2022 FUJITSU LIMITED
+ *
+ * See Documentation/ABI/testing/sysfs-devices-system-cpu for more information.
+ */
+
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/device.h>
+#include <linux/pfctl.h>
+#include <linux/slab.h>
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+const struct pfctl_group *pgroups;
+enum cpuhp_state hp_online;
+
+static struct device_attribute **
+get_pfctl_attribute(unsigned int level, enum cache_type type)
+{
+	int i;
+
+	for (i = 0; pgroups[i].attrs; i++)
+		if ((level == pgroups[i].level) && (type == pgroups[i].type))
+			return pgroups[i].attrs;
+
+	return NULL;
+}
+
+static int remove_pfctl_attr(struct device *index_dev, void *data)
+{
+	struct cacheinfo *leaf = dev_get_drvdata(index_dev);
+	struct device_attribute **attrs;
+	struct device *pfctl_dev;
+	int i;
+
+	attrs = get_pfctl_attribute(leaf->level, leaf->type);
+	if (!attrs)
+		return 0;
+
+	pfctl_dev = device_find_child_by_name(index_dev, "prefetch_control");
+	if (!pfctl_dev)
+		return 0;
+
+	for (i = 0; attrs[i]; i++)
+		device_remove_file(pfctl_dev, attrs[i]);
+
+	device_unregister(pfctl_dev);
+	put_device(pfctl_dev);
+
+	pfctl_dev = NULL;
+
+	return 0;
+}
+
+static int create_pfctl_attr(struct device *index_dev, void *data)
+{
+	struct cacheinfo *leaf = dev_get_drvdata(index_dev);
+	struct device_attribute **attrs;
+	struct device *pfctl_dev;
+	int i, err;
+
+	attrs = get_pfctl_attribute(leaf->level, leaf->type);
+	if (!attrs)
+		return 0;
+
+	pfctl_dev = cpu_device_create(index_dev, NULL, NULL,
+				      "prefetch_control");
+	if (IS_ERR(pfctl_dev))
+		return PTR_ERR(pfctl_dev);
+
+	for (i = 0; attrs[i]; i++) {
+		err = device_create_file(pfctl_dev, attrs[i]);
+		if (err) {
+			while (--i >= 0)
+				device_remove_file(pfctl_dev, attrs[i]);
+
+			device_unregister(pfctl_dev);
+			pfctl_dev = NULL;
+
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+static int pfctl_online(unsigned int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+	struct device *cache_dev;
+	int ret;
+
+	cache_dev = device_find_child_by_name(cpu_dev, "cache");
+	if (!cache_dev)
+		return -ENODEV;
+
+	ret = device_for_each_child(cache_dev, NULL, create_pfctl_attr);
+	if (ret < 0)
+		device_for_each_child(cache_dev, NULL, remove_pfctl_attr);
+
+	put_device(cache_dev);
+
+	return ret;
+}
+
+static int pfctl_prepare_down(unsigned int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+	struct device *cache_dev;
+
+	cache_dev = device_find_child_by_name(cpu_dev, "cache");
+	if (!cache_dev)
+		return 0;
+
+	device_for_each_child(cache_dev, NULL, remove_pfctl_attr);
+
+	put_device(cache_dev);
+
+	return 0;
+}
+
+/**
+ * pfctl_register_attrs - register a Hardware Prefetch Control attributes
+ * @pfctl_groups: pfctl_groups contains device attribute group to control the
+ *                hardware prefetch register.
+ *
+ * Note: Call this function after the cache device is initialized because it
+ * requires access to the cache device. (e.g. Call at the late_initcall)
+ *
+ * Context: Any context.
+ * Return: 0 on success, negative error code on failure.
+ */
+int pfctl_register_attrs(const struct pfctl_group *pfctl_groups)
+{
+	int ret;
+
+	if (pgroups)
+		return -EEXIST;
+
+	pgroups = pfctl_groups;
+
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "base/pfctl:online",
+				pfctl_online, pfctl_prepare_down);
+	if (ret < 0) {
+		pr_err("failed to register hotplug callbacks\n");
+		pgroups = NULL;
+		return ret;
+	}
+
+	hp_online = ret;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pfctl_register_attrs);
+
+/**
+ * pfctl_unregister_attrs - unregister the Hardware Prefetch Control driver
+ * @pfctl_groups: Used to verify that this function is called by the same driver
+ *                that called pfctl_register_attrs.
+ *
+ * Context: Any context.
+ * Return: nothing.
+ */
+void pfctl_unregister_attrs(const struct pfctl_group *pfctl_groups)
+{
+	if (!pgroups || (pfctl_groups != pgroups))
+		return;
+
+	cpuhp_remove_state(hp_online);
+
+	pgroups = NULL;
+}
+EXPORT_SYMBOL_GPL(pfctl_unregister_attrs);
diff --git a/include/linux/pfctl.h b/include/linux/pfctl.h
new file mode 100644
index 000000000000..ecdab78be09f
--- /dev/null
+++ b/include/linux/pfctl.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_PFCTL_H
+#define _LINUX_PFCTL_H
+
+struct pfctl_group {
+	unsigned int		level;
+	enum cache_type		type;
+	struct device_attribute	**attrs;
+};
+
+int pfctl_register_attrs(const struct pfctl_group *pfctl_groups);
+void pfctl_unregister_attrs(const struct pfctl_group *pfctl_groups);
+
+#endif
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 1/8] drivers: base: Add hardware prefetch control core driver
@ 2022-05-18  6:30   ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds a register/unregister function to provide sysfs interface to
control CPU's hardware prefetch behavior. It creates the
"prefetch_control" sysfs directory and some attributes.

Attributes are hardware dependent, so it must be implemented for each
hardware. If CPU has a hardware prefetch behavior, call this function
to create sysfs.

Following patches add support for A64FX and x86.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 drivers/base/pfctl.c  | 180 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pfctl.h |  14 ++++
 2 files changed, 194 insertions(+)
 create mode 100644 drivers/base/pfctl.c
 create mode 100644 include/linux/pfctl.h

diff --git a/drivers/base/pfctl.c b/drivers/base/pfctl.c
new file mode 100644
index 000000000000..08ee8faaf277
--- /dev/null
+++ b/drivers/base/pfctl.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hardware prefetch control support via sysfs.
+ *
+ * Copyright 2022 FUJITSU LIMITED
+ *
+ * See Documentation/ABI/testing/sysfs-devices-system-cpu for more information.
+ */
+
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/device.h>
+#include <linux/pfctl.h>
+#include <linux/slab.h>
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+const struct pfctl_group *pgroups;
+enum cpuhp_state hp_online;
+
+static struct device_attribute **
+get_pfctl_attribute(unsigned int level, enum cache_type type)
+{
+	int i;
+
+	for (i = 0; pgroups[i].attrs; i++)
+		if ((level == pgroups[i].level) && (type == pgroups[i].type))
+			return pgroups[i].attrs;
+
+	return NULL;
+}
+
+static int remove_pfctl_attr(struct device *index_dev, void *data)
+{
+	struct cacheinfo *leaf = dev_get_drvdata(index_dev);
+	struct device_attribute **attrs;
+	struct device *pfctl_dev;
+	int i;
+
+	attrs = get_pfctl_attribute(leaf->level, leaf->type);
+	if (!attrs)
+		return 0;
+
+	pfctl_dev = device_find_child_by_name(index_dev, "prefetch_control");
+	if (!pfctl_dev)
+		return 0;
+
+	for (i = 0; attrs[i]; i++)
+		device_remove_file(pfctl_dev, attrs[i]);
+
+	device_unregister(pfctl_dev);
+	put_device(pfctl_dev);
+
+	pfctl_dev = NULL;
+
+	return 0;
+}
+
+static int create_pfctl_attr(struct device *index_dev, void *data)
+{
+	struct cacheinfo *leaf = dev_get_drvdata(index_dev);
+	struct device_attribute **attrs;
+	struct device *pfctl_dev;
+	int i, err;
+
+	attrs = get_pfctl_attribute(leaf->level, leaf->type);
+	if (!attrs)
+		return 0;
+
+	pfctl_dev = cpu_device_create(index_dev, NULL, NULL,
+				      "prefetch_control");
+	if (IS_ERR(pfctl_dev))
+		return PTR_ERR(pfctl_dev);
+
+	for (i = 0; attrs[i]; i++) {
+		err = device_create_file(pfctl_dev, attrs[i]);
+		if (err) {
+			while (--i >= 0)
+				device_remove_file(pfctl_dev, attrs[i]);
+
+			device_unregister(pfctl_dev);
+			pfctl_dev = NULL;
+
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+static int pfctl_online(unsigned int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+	struct device *cache_dev;
+	int ret;
+
+	cache_dev = device_find_child_by_name(cpu_dev, "cache");
+	if (!cache_dev)
+		return -ENODEV;
+
+	ret = device_for_each_child(cache_dev, NULL, create_pfctl_attr);
+	if (ret < 0)
+		device_for_each_child(cache_dev, NULL, remove_pfctl_attr);
+
+	put_device(cache_dev);
+
+	return ret;
+}
+
+static int pfctl_prepare_down(unsigned int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+	struct device *cache_dev;
+
+	cache_dev = device_find_child_by_name(cpu_dev, "cache");
+	if (!cache_dev)
+		return 0;
+
+	device_for_each_child(cache_dev, NULL, remove_pfctl_attr);
+
+	put_device(cache_dev);
+
+	return 0;
+}
+
+/**
+ * pfctl_register_attrs - register a Hardware Prefetch Control attributes
+ * @pfctl_groups: pfctl_groups contains device attribute group to control the
+ *                hardware prefetch register.
+ *
+ * Note: Call this function after the cache device is initialized because it
+ * requires access to the cache device. (e.g. Call at the late_initcall)
+ *
+ * Context: Any context.
+ * Return: 0 on success, negative error code on failure.
+ */
+int pfctl_register_attrs(const struct pfctl_group *pfctl_groups)
+{
+	int ret;
+
+	if (pgroups)
+		return -EEXIST;
+
+	pgroups = pfctl_groups;
+
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "base/pfctl:online",
+				pfctl_online, pfctl_prepare_down);
+	if (ret < 0) {
+		pr_err("failed to register hotplug callbacks\n");
+		pgroups = NULL;
+		return ret;
+	}
+
+	hp_online = ret;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pfctl_register_attrs);
+
+/**
+ * pfctl_unregister_attrs - unregister the Hardware Prefetch Control driver
+ * @pfctl_groups: Used to verify that this function is called by the same driver
+ *                that called pfctl_register_attrs.
+ *
+ * Context: Any context.
+ * Return: nothing.
+ */
+void pfctl_unregister_attrs(const struct pfctl_group *pfctl_groups)
+{
+	if (!pgroups || (pfctl_groups != pgroups))
+		return;
+
+	cpuhp_remove_state(hp_online);
+
+	pgroups = NULL;
+}
+EXPORT_SYMBOL_GPL(pfctl_unregister_attrs);
diff --git a/include/linux/pfctl.h b/include/linux/pfctl.h
new file mode 100644
index 000000000000..ecdab78be09f
--- /dev/null
+++ b/include/linux/pfctl.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_PFCTL_H
+#define _LINUX_PFCTL_H
+
+struct pfctl_group {
+	unsigned int		level;
+	enum cache_type		type;
+	struct device_attribute	**attrs;
+};
+
+int pfctl_register_attrs(const struct pfctl_group *pfctl_groups);
+void pfctl_unregister_attrs(const struct pfctl_group *pfctl_groups);
+
+#endif
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 2/8] drivers: base: Add Kconfig/Makefile to build hardware prefetch control core driver
  2022-05-18  6:30 ` Kohei Tarumizu
@ 2022-05-18  6:30   ` Kohei Tarumizu
  -1 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds Kconfig/Makefile to build hardware prefetch control core driver,
and also adds a MAINTAINERS entry.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 MAINTAINERS           | 6 ++++++
 drivers/base/Kconfig  | 9 +++++++++
 drivers/base/Makefile | 1 +
 3 files changed, 16 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d6d879cb0afd..f188403bc2e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8620,6 +8620,12 @@ F:	include/linux/hwmon*.h
 F:	include/trace/events/hwmon*.h
 K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
 
+HARDWARE PREFETCH CONTROL DRIVERS
+M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
+S:	Maintained
+F:	drivers/base/pfctl.c
+F:	include/linux/pfctl.h
+
 HARDWARE RANDOM NUMBER GENERATOR CORE
 M:	Matt Mackall <mpm@selenic.com>
 M:	Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 6f04b831a5c0..8f8a69e7f645 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -230,4 +230,13 @@ config GENERIC_ARCH_NUMA
 	  Enable support for generic NUMA implementation. Currently, RISC-V
 	  and ARM64 use it.
 
+config HWPF_CONTROL
+	bool "Hardware Prefetch Control driver"
+	help
+	  This driver allows user to control CPU's Hardware Prefetch behavior.
+	  If the machine supports this behavior, it provides a sysfs interface.
+
+	  See Documentation/ABI/testing/sysfs-devices-system-cpu for more
+	  information.
+
 endmenu
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 02f7f1358e86..13f3a0ddf3d1 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
 obj-$(CONFIG_GENERIC_MSI_IRQ_DOMAIN) += platform-msi.o
 obj-$(CONFIG_GENERIC_ARCH_TOPOLOGY) += arch_topology.o
 obj-$(CONFIG_GENERIC_ARCH_NUMA) += arch_numa.o
+obj-$(CONFIG_HWPF_CONTROL)	+= pfctl.o
 
 obj-y			+= test/
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 2/8] drivers: base: Add Kconfig/Makefile to build hardware prefetch control core driver
@ 2022-05-18  6:30   ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds Kconfig/Makefile to build hardware prefetch control core driver,
and also adds a MAINTAINERS entry.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 MAINTAINERS           | 6 ++++++
 drivers/base/Kconfig  | 9 +++++++++
 drivers/base/Makefile | 1 +
 3 files changed, 16 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d6d879cb0afd..f188403bc2e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8620,6 +8620,12 @@ F:	include/linux/hwmon*.h
 F:	include/trace/events/hwmon*.h
 K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
 
+HARDWARE PREFETCH CONTROL DRIVERS
+M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
+S:	Maintained
+F:	drivers/base/pfctl.c
+F:	include/linux/pfctl.h
+
 HARDWARE RANDOM NUMBER GENERATOR CORE
 M:	Matt Mackall <mpm@selenic.com>
 M:	Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 6f04b831a5c0..8f8a69e7f645 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -230,4 +230,13 @@ config GENERIC_ARCH_NUMA
 	  Enable support for generic NUMA implementation. Currently, RISC-V
 	  and ARM64 use it.
 
+config HWPF_CONTROL
+	bool "Hardware Prefetch Control driver"
+	help
+	  This driver allows user to control CPU's Hardware Prefetch behavior.
+	  If the machine supports this behavior, it provides a sysfs interface.
+
+	  See Documentation/ABI/testing/sysfs-devices-system-cpu for more
+	  information.
+
 endmenu
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 02f7f1358e86..13f3a0ddf3d1 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
 obj-$(CONFIG_GENERIC_MSI_IRQ_DOMAIN) += platform-msi.o
 obj-$(CONFIG_GENERIC_ARCH_TOPOLOGY) += arch_topology.o
 obj-$(CONFIG_GENERIC_ARCH_NUMA) += arch_numa.o
+obj-$(CONFIG_HWPF_CONTROL)	+= pfctl.o
 
 obj-y			+= test/
 
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 3/8] soc: fujitsu: Add hardware prefetch control support for A64FX
  2022-05-18  6:30 ` Kohei Tarumizu
@ 2022-05-18  6:30   ` Kohei Tarumizu
  -1 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds module init/exit code to create sysfs attributes for A64FX with
"stream_detect_prefetcher_enable", "stream_detect_prefetcher_strong"
and "stream_detect_prefetcher_dist".

This driver works only if part number is FUJITSU_CPU_PART_A64FX. The
details of the registers to be read and written in this patch are
described below.

"https://github.com/fujitsu/A64FX/tree/master/doc/"
    A64FX_Specification_HPC_Extension_v1_EN.pdf

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 drivers/soc/fujitsu/a64fx-pfctl.c | 373 ++++++++++++++++++++++++++++++
 1 file changed, 373 insertions(+)
 create mode 100644 drivers/soc/fujitsu/a64fx-pfctl.c

diff --git a/drivers/soc/fujitsu/a64fx-pfctl.c b/drivers/soc/fujitsu/a64fx-pfctl.c
new file mode 100644
index 000000000000..f765f53987d4
--- /dev/null
+++ b/drivers/soc/fujitsu/a64fx-pfctl.c
@@ -0,0 +1,373 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2022 FUJITSU LIMITED
+ *
+ * A64FX Hardware Prefetch Control support
+ */
+
+#include <asm/cputype.h>
+#include <linux/cacheinfo.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/pfctl.h>
+#include <linux/smp.h>
+#include <linux/sysfs.h>
+
+/*
+ * Constants for these add the "A64FX_SDPF" prefix to the name described in
+ * section "1.3.4.2. IMP_PF_STREAM_DETECT_CTRL_EL0" of "A64FX specification".
+ * (https://github.com/fujitsu/A64FX/tree/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf")
+ * See this document for register specification details.
+ */
+#define A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0	sys_reg(3, 3, 11, 4, 0)
+#define A64FX_SDPF_V					BIT_ULL(63)
+#define A64FX_SDPF_L1PF_DIS				BIT_ULL(59)
+#define A64FX_SDPF_L2PF_DIS				BIT_ULL(58)
+#define A64FX_SDPF_L1W					BIT_ULL(55)
+#define A64FX_SDPF_L2W					BIT_ULL(54)
+#define A64FX_SDPF_L1_DIST				GENMASK_ULL(27, 24)
+#define A64FX_SDPF_L2_DIST				GENMASK_ULL(19, 16)
+
+#define PFCTL_MIN_L1_DIST				256
+#define PFCTL_MIN_L2_DIST				1024
+#define PFCTL_DIST_AUTO_VAL				0
+#define PFCTL_STRONG_VAL				0
+#define PFCTL_WEAK_VAL					1
+
+/*
+ * Define bitfield access macros for non-constant mask, because macros such as
+ * FIELD_FIT defined in include/linux/bitfield.h require constant mask.
+ */
+#define NC_FIELD_FIT(_mask, _val)					\
+	!((((typeof(_mask))_val) << __bf_shf(_mask)) & ~(_mask))
+
+#define NC_FIELD_PREP(_mask, _val)					\
+	(((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask))
+
+#define NC_FIELD_GET(_mask, _reg)					\
+	((typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask)))
+
+struct a64fx_pfctl_attr {
+	struct device_attribute	attr;
+	u64			mask;
+	void			*data;
+};
+
+static const char strength_strong_string[] = "strong";
+static const char strength_weak_string[] = "weak";
+static const char dist_auto_string[] = "auto";
+
+/*
+ * Returns the cpu number of the cpu_device(/sys/devices/system/cpu/cpuX)
+ * in the ancestor directory of prefetch_control.
+ *
+ * When initializing this driver, it is verified that the cache directory exists
+ * under cpuX device. Therefore, the third level up from prefetch_control is
+ * cpuX device as shown below.
+ *
+ * /sys/devices/system/cpu/cpuX/cache/indexX/prefetch_control
+ */
+static inline unsigned int pfctl_dev_get_cpu(struct device *pfctl_dev)
+{
+	return pfctl_dev->parent->parent->parent->id;
+}
+
+/* Enable the value written to the A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0 */
+static inline void enable_verify(u64 *reg)
+{
+	*reg |= NC_FIELD_PREP(A64FX_SDPF_V, 1);
+}
+
+static void _pfctl_read_mask(void *_reg)
+{
+	u64 *reg = (u64 *)_reg;
+
+	*reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
+}
+
+static u64 pfctl_read_mask(unsigned int cpu, u64 mask)
+{
+	u64 reg;
+
+	smp_call_function_single(cpu, _pfctl_read_mask, &reg, true);
+
+	return NC_FIELD_GET(mask, reg);
+}
+
+struct write_info {
+	u64 mask;
+	u64 val;
+};
+
+static void _pfctl_write_mask(void *info)
+{
+	struct write_info *winfo = info;
+	u64 reg;
+
+	reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
+
+	reg &= ~winfo->mask;
+	reg |= NC_FIELD_PREP(winfo->mask, winfo->val);
+
+	enable_verify(&reg);
+
+	write_sysreg_s(reg, A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
+}
+
+static int pfctl_write_mask(unsigned int cpu, u64 mask, u64 val)
+{
+	struct write_info info = {
+		.mask	= mask,
+		.val	= val,
+	};
+
+	if (!NC_FIELD_FIT(mask, val))
+		return -EINVAL;
+
+	smp_call_function_single(cpu, _pfctl_write_mask, &info, true);
+
+	return 0;
+}
+
+static ssize_t
+pfctl_enable_show(struct device *pfctl_dev, struct device_attribute *attr,
+		char *buf)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	bool val;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+
+	val = (bool)pfctl_read_mask(cpu, aa->mask);
+
+	return sysfs_emit(buf, "%d\n", val ? 0 : 1);
+}
+
+static ssize_t
+pfctl_enable_store(struct device *pfctl_dev, struct device_attribute *attr,
+		 const char *buf, size_t size)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	bool val;
+	int ret;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+
+	if (strtobool(buf, &val) < 0)
+		return -EINVAL;
+
+	ret = pfctl_write_mask(cpu, aa->mask, val ? 0 : 1);
+	if (ret < 0)
+		return ret;
+
+	return size;
+}
+
+#define A64FX_PFCTL_ENABLE_ATTR(_level, _mask)				\
+	struct a64fx_pfctl_attr attr_l##_level##_enable = {		\
+		.attr = __ATTR(stream_detect_prefetcher_enable, 0600,	\
+			       pfctl_enable_show, pfctl_enable_store),	\
+		.mask = _mask, }
+
+
+static A64FX_PFCTL_ENABLE_ATTR(1, A64FX_SDPF_L1PF_DIS);
+static A64FX_PFCTL_ENABLE_ATTR(2, A64FX_SDPF_L2PF_DIS);
+
+static ssize_t
+pfctl_dist_show(struct device *pfctl_dev, struct device_attribute *attr,
+		char *buf)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	u64 val, min_dist;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+	min_dist = (u64)aa->data;
+
+	val = pfctl_read_mask(cpu, aa->mask);
+
+	if (val == PFCTL_DIST_AUTO_VAL)
+		return sysfs_emit(buf, "%s\n", dist_auto_string);
+	else
+		return sysfs_emit(buf, "%llu\n", val * min_dist);
+}
+
+static ssize_t
+pfctl_dist_store(struct device *pfctl_dev, struct device_attribute *attr,
+		 const char *buf, size_t size)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	u64 min_dist, val;
+	int ret;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+	min_dist = (u64)aa->data;
+
+	if (sysfs_streq(buf, dist_auto_string)) {
+		val = PFCTL_DIST_AUTO_VAL;
+	} else {
+		ret = kstrtoull(buf, 10, &val);
+		if (ret < 0 || val < 1)
+			return -EINVAL;
+	}
+
+	val = roundup(val, min_dist) / min_dist;
+
+	if (!NC_FIELD_FIT(aa->mask, val))
+		return -EINVAL;
+
+	ret = pfctl_write_mask(cpu, aa->mask, val);
+	if (ret < 0)
+		return ret;
+
+	return size;
+}
+
+#define PFCTL_DIST_ATTR(_level, _mask, _min_dist)			\
+	struct a64fx_pfctl_attr attr_l##_level##_dist = {		\
+		.attr = __ATTR(stream_detect_prefetcher_dist, 0600,	\
+			       pfctl_dist_show, pfctl_dist_store),	\
+		.mask = _mask,						\
+		.data = (void *)(u64)_min_dist, }
+
+static PFCTL_DIST_ATTR(1, A64FX_SDPF_L1_DIST, PFCTL_MIN_L1_DIST);
+static PFCTL_DIST_ATTR(2, A64FX_SDPF_L2_DIST, PFCTL_MIN_L2_DIST);
+
+static ssize_t
+pfctl_strength_show(struct device *pfctl_dev, struct device_attribute *attr,
+		    char *buf)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	u64 val;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+
+	val = pfctl_read_mask(cpu, aa->mask);
+
+	switch (val) {
+	case PFCTL_STRONG_VAL:
+		return sysfs_emit(buf, "%s\n", strength_strong_string);
+	case PFCTL_WEAK_VAL:
+		return sysfs_emit(buf, "%s\n", strength_weak_string);
+	default:
+		return -EINVAL;
+	}
+}
+
+static ssize_t
+pfctl_strength_store(struct device *pfctl_dev, struct device_attribute *attr,
+		     const char *buf, size_t size)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	u64 val;
+	int ret;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+
+	if (sysfs_streq(buf, strength_strong_string))
+		val = PFCTL_STRONG_VAL;
+	else if (sysfs_streq(buf, strength_weak_string))
+		val = PFCTL_WEAK_VAL;
+	else
+		return -EINVAL;
+
+	ret = pfctl_write_mask(cpu, aa->mask, val);
+	if (ret < 0)
+		return ret;
+
+	return size;
+}
+
+#define PFCTL_STRENGTH_ATTR(_level, _mask)				\
+	struct a64fx_pfctl_attr attr_l##_level##_strength = {		\
+		.attr = __ATTR(stream_detect_prefetcher_strength, 0600, \
+			       pfctl_strength_show,			\
+			       pfctl_strength_store),			\
+		.mask = _mask, }
+
+static PFCTL_STRENGTH_ATTR(1, A64FX_SDPF_L1W);
+static PFCTL_STRENGTH_ATTR(2, A64FX_SDPF_L2W);
+
+static ssize_t
+pfctl_strength_available_show(struct device *dev, struct device_attribute *attr,
+			      char *buf)
+{
+	return sysfs_emit(buf, "%s %s\n", strength_strong_string,
+			  strength_weak_string);
+}
+
+/*
+ * A64FX has same kind of available strength for any caches, so define only one
+ * attribute.
+ */
+struct a64fx_pfctl_attr attr_strength_available = {
+	.attr = __ATTR(stream_detect_prefetcher_strength_available, 0400,
+		       pfctl_strength_available_show, NULL), };
+
+static struct device_attribute *l1_attrs[] __ro_after_init = {
+	&attr_l1_enable.attr,
+	&attr_l1_dist.attr,
+	&attr_l1_strength.attr,
+	&attr_strength_available.attr,
+	NULL,
+};
+
+static struct device_attribute *l2_attrs[] __ro_after_init = {
+	&attr_l2_enable.attr,
+	&attr_l2_dist.attr,
+	&attr_l2_strength.attr,
+	&attr_strength_available.attr,
+	NULL,
+};
+
+static const struct pfctl_group a64fx_pfctl_groups[] = {
+	{
+		.level = 1,
+		.type = CACHE_TYPE_DATA,
+		.attrs = l1_attrs,
+	},
+	{
+		.level = 2,
+		.type = CACHE_TYPE_UNIFIED,
+		.attrs = l2_attrs,
+	},
+	{
+		.attrs = NULL,
+	},
+};
+
+/*
+ * This driver returns a negative value if it does not support the Hardware
+ * Prefetch Control or if it is running on a VM guest.
+ */
+static int __init a64fx_pfctl_init(void)
+{
+	unsigned long implementor = read_cpuid_implementor();
+	unsigned long part_number = read_cpuid_part_number();
+
+	if (!is_kernel_in_hyp_mode())
+		return -EINVAL;
+
+	if ((implementor != ARM_CPU_IMP_FUJITSU) ||
+	    (part_number != FUJITSU_CPU_PART_A64FX))
+		return -ENODEV;
+
+	return pfctl_register_attrs(a64fx_pfctl_groups);
+}
+
+static void __exit a64fx_pfctl_exit(void)
+{
+	return pfctl_unregister_attrs(a64fx_pfctl_groups);
+}
+
+late_initcall(a64fx_pfctl_init);
+module_exit(a64fx_pfctl_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("FUJITSU LIMITED");
+MODULE_DESCRIPTION("A64FX Hardware Prefetch Control Driver");
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 3/8] soc: fujitsu: Add hardware prefetch control support for A64FX
@ 2022-05-18  6:30   ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds module init/exit code to create sysfs attributes for A64FX with
"stream_detect_prefetcher_enable", "stream_detect_prefetcher_strong"
and "stream_detect_prefetcher_dist".

This driver works only if part number is FUJITSU_CPU_PART_A64FX. The
details of the registers to be read and written in this patch are
described below.

"https://github.com/fujitsu/A64FX/tree/master/doc/"
    A64FX_Specification_HPC_Extension_v1_EN.pdf

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 drivers/soc/fujitsu/a64fx-pfctl.c | 373 ++++++++++++++++++++++++++++++
 1 file changed, 373 insertions(+)
 create mode 100644 drivers/soc/fujitsu/a64fx-pfctl.c

diff --git a/drivers/soc/fujitsu/a64fx-pfctl.c b/drivers/soc/fujitsu/a64fx-pfctl.c
new file mode 100644
index 000000000000..f765f53987d4
--- /dev/null
+++ b/drivers/soc/fujitsu/a64fx-pfctl.c
@@ -0,0 +1,373 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2022 FUJITSU LIMITED
+ *
+ * A64FX Hardware Prefetch Control support
+ */
+
+#include <asm/cputype.h>
+#include <linux/cacheinfo.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/pfctl.h>
+#include <linux/smp.h>
+#include <linux/sysfs.h>
+
+/*
+ * Constants for these add the "A64FX_SDPF" prefix to the name described in
+ * section "1.3.4.2. IMP_PF_STREAM_DETECT_CTRL_EL0" of "A64FX specification".
+ * (https://github.com/fujitsu/A64FX/tree/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf")
+ * See this document for register specification details.
+ */
+#define A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0	sys_reg(3, 3, 11, 4, 0)
+#define A64FX_SDPF_V					BIT_ULL(63)
+#define A64FX_SDPF_L1PF_DIS				BIT_ULL(59)
+#define A64FX_SDPF_L2PF_DIS				BIT_ULL(58)
+#define A64FX_SDPF_L1W					BIT_ULL(55)
+#define A64FX_SDPF_L2W					BIT_ULL(54)
+#define A64FX_SDPF_L1_DIST				GENMASK_ULL(27, 24)
+#define A64FX_SDPF_L2_DIST				GENMASK_ULL(19, 16)
+
+#define PFCTL_MIN_L1_DIST				256
+#define PFCTL_MIN_L2_DIST				1024
+#define PFCTL_DIST_AUTO_VAL				0
+#define PFCTL_STRONG_VAL				0
+#define PFCTL_WEAK_VAL					1
+
+/*
+ * Define bitfield access macros for non-constant mask, because macros such as
+ * FIELD_FIT defined in include/linux/bitfield.h require constant mask.
+ */
+#define NC_FIELD_FIT(_mask, _val)					\
+	!((((typeof(_mask))_val) << __bf_shf(_mask)) & ~(_mask))
+
+#define NC_FIELD_PREP(_mask, _val)					\
+	(((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask))
+
+#define NC_FIELD_GET(_mask, _reg)					\
+	((typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask)))
+
+struct a64fx_pfctl_attr {
+	struct device_attribute	attr;
+	u64			mask;
+	void			*data;
+};
+
+static const char strength_strong_string[] = "strong";
+static const char strength_weak_string[] = "weak";
+static const char dist_auto_string[] = "auto";
+
+/*
+ * Returns the cpu number of the cpu_device(/sys/devices/system/cpu/cpuX)
+ * in the ancestor directory of prefetch_control.
+ *
+ * When initializing this driver, it is verified that the cache directory exists
+ * under cpuX device. Therefore, the third level up from prefetch_control is
+ * cpuX device as shown below.
+ *
+ * /sys/devices/system/cpu/cpuX/cache/indexX/prefetch_control
+ */
+static inline unsigned int pfctl_dev_get_cpu(struct device *pfctl_dev)
+{
+	return pfctl_dev->parent->parent->parent->id;
+}
+
+/* Enable the value written to the A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0 */
+static inline void enable_verify(u64 *reg)
+{
+	*reg |= NC_FIELD_PREP(A64FX_SDPF_V, 1);
+}
+
+static void _pfctl_read_mask(void *_reg)
+{
+	u64 *reg = (u64 *)_reg;
+
+	*reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
+}
+
+static u64 pfctl_read_mask(unsigned int cpu, u64 mask)
+{
+	u64 reg;
+
+	smp_call_function_single(cpu, _pfctl_read_mask, &reg, true);
+
+	return NC_FIELD_GET(mask, reg);
+}
+
+struct write_info {
+	u64 mask;
+	u64 val;
+};
+
+static void _pfctl_write_mask(void *info)
+{
+	struct write_info *winfo = info;
+	u64 reg;
+
+	reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
+
+	reg &= ~winfo->mask;
+	reg |= NC_FIELD_PREP(winfo->mask, winfo->val);
+
+	enable_verify(&reg);
+
+	write_sysreg_s(reg, A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
+}
+
+static int pfctl_write_mask(unsigned int cpu, u64 mask, u64 val)
+{
+	struct write_info info = {
+		.mask	= mask,
+		.val	= val,
+	};
+
+	if (!NC_FIELD_FIT(mask, val))
+		return -EINVAL;
+
+	smp_call_function_single(cpu, _pfctl_write_mask, &info, true);
+
+	return 0;
+}
+
+static ssize_t
+pfctl_enable_show(struct device *pfctl_dev, struct device_attribute *attr,
+		char *buf)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	bool val;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+
+	val = (bool)pfctl_read_mask(cpu, aa->mask);
+
+	return sysfs_emit(buf, "%d\n", val ? 0 : 1);
+}
+
+static ssize_t
+pfctl_enable_store(struct device *pfctl_dev, struct device_attribute *attr,
+		 const char *buf, size_t size)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	bool val;
+	int ret;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+
+	if (strtobool(buf, &val) < 0)
+		return -EINVAL;
+
+	ret = pfctl_write_mask(cpu, aa->mask, val ? 0 : 1);
+	if (ret < 0)
+		return ret;
+
+	return size;
+}
+
+#define A64FX_PFCTL_ENABLE_ATTR(_level, _mask)				\
+	struct a64fx_pfctl_attr attr_l##_level##_enable = {		\
+		.attr = __ATTR(stream_detect_prefetcher_enable, 0600,	\
+			       pfctl_enable_show, pfctl_enable_store),	\
+		.mask = _mask, }
+
+
+static A64FX_PFCTL_ENABLE_ATTR(1, A64FX_SDPF_L1PF_DIS);
+static A64FX_PFCTL_ENABLE_ATTR(2, A64FX_SDPF_L2PF_DIS);
+
+static ssize_t
+pfctl_dist_show(struct device *pfctl_dev, struct device_attribute *attr,
+		char *buf)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	u64 val, min_dist;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+	min_dist = (u64)aa->data;
+
+	val = pfctl_read_mask(cpu, aa->mask);
+
+	if (val == PFCTL_DIST_AUTO_VAL)
+		return sysfs_emit(buf, "%s\n", dist_auto_string);
+	else
+		return sysfs_emit(buf, "%llu\n", val * min_dist);
+}
+
+static ssize_t
+pfctl_dist_store(struct device *pfctl_dev, struct device_attribute *attr,
+		 const char *buf, size_t size)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	u64 min_dist, val;
+	int ret;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+	min_dist = (u64)aa->data;
+
+	if (sysfs_streq(buf, dist_auto_string)) {
+		val = PFCTL_DIST_AUTO_VAL;
+	} else {
+		ret = kstrtoull(buf, 10, &val);
+		if (ret < 0 || val < 1)
+			return -EINVAL;
+	}
+
+	val = roundup(val, min_dist) / min_dist;
+
+	if (!NC_FIELD_FIT(aa->mask, val))
+		return -EINVAL;
+
+	ret = pfctl_write_mask(cpu, aa->mask, val);
+	if (ret < 0)
+		return ret;
+
+	return size;
+}
+
+#define PFCTL_DIST_ATTR(_level, _mask, _min_dist)			\
+	struct a64fx_pfctl_attr attr_l##_level##_dist = {		\
+		.attr = __ATTR(stream_detect_prefetcher_dist, 0600,	\
+			       pfctl_dist_show, pfctl_dist_store),	\
+		.mask = _mask,						\
+		.data = (void *)(u64)_min_dist, }
+
+static PFCTL_DIST_ATTR(1, A64FX_SDPF_L1_DIST, PFCTL_MIN_L1_DIST);
+static PFCTL_DIST_ATTR(2, A64FX_SDPF_L2_DIST, PFCTL_MIN_L2_DIST);
+
+static ssize_t
+pfctl_strength_show(struct device *pfctl_dev, struct device_attribute *attr,
+		    char *buf)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	u64 val;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+
+	val = pfctl_read_mask(cpu, aa->mask);
+
+	switch (val) {
+	case PFCTL_STRONG_VAL:
+		return sysfs_emit(buf, "%s\n", strength_strong_string);
+	case PFCTL_WEAK_VAL:
+		return sysfs_emit(buf, "%s\n", strength_weak_string);
+	default:
+		return -EINVAL;
+	}
+}
+
+static ssize_t
+pfctl_strength_store(struct device *pfctl_dev, struct device_attribute *attr,
+		     const char *buf, size_t size)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct a64fx_pfctl_attr *aa;
+	u64 val;
+	int ret;
+
+	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
+
+	if (sysfs_streq(buf, strength_strong_string))
+		val = PFCTL_STRONG_VAL;
+	else if (sysfs_streq(buf, strength_weak_string))
+		val = PFCTL_WEAK_VAL;
+	else
+		return -EINVAL;
+
+	ret = pfctl_write_mask(cpu, aa->mask, val);
+	if (ret < 0)
+		return ret;
+
+	return size;
+}
+
+#define PFCTL_STRENGTH_ATTR(_level, _mask)				\
+	struct a64fx_pfctl_attr attr_l##_level##_strength = {		\
+		.attr = __ATTR(stream_detect_prefetcher_strength, 0600, \
+			       pfctl_strength_show,			\
+			       pfctl_strength_store),			\
+		.mask = _mask, }
+
+static PFCTL_STRENGTH_ATTR(1, A64FX_SDPF_L1W);
+static PFCTL_STRENGTH_ATTR(2, A64FX_SDPF_L2W);
+
+static ssize_t
+pfctl_strength_available_show(struct device *dev, struct device_attribute *attr,
+			      char *buf)
+{
+	return sysfs_emit(buf, "%s %s\n", strength_strong_string,
+			  strength_weak_string);
+}
+
+/*
+ * A64FX has same kind of available strength for any caches, so define only one
+ * attribute.
+ */
+struct a64fx_pfctl_attr attr_strength_available = {
+	.attr = __ATTR(stream_detect_prefetcher_strength_available, 0400,
+		       pfctl_strength_available_show, NULL), };
+
+static struct device_attribute *l1_attrs[] __ro_after_init = {
+	&attr_l1_enable.attr,
+	&attr_l1_dist.attr,
+	&attr_l1_strength.attr,
+	&attr_strength_available.attr,
+	NULL,
+};
+
+static struct device_attribute *l2_attrs[] __ro_after_init = {
+	&attr_l2_enable.attr,
+	&attr_l2_dist.attr,
+	&attr_l2_strength.attr,
+	&attr_strength_available.attr,
+	NULL,
+};
+
+static const struct pfctl_group a64fx_pfctl_groups[] = {
+	{
+		.level = 1,
+		.type = CACHE_TYPE_DATA,
+		.attrs = l1_attrs,
+	},
+	{
+		.level = 2,
+		.type = CACHE_TYPE_UNIFIED,
+		.attrs = l2_attrs,
+	},
+	{
+		.attrs = NULL,
+	},
+};
+
+/*
+ * This driver returns a negative value if it does not support the Hardware
+ * Prefetch Control or if it is running on a VM guest.
+ */
+static int __init a64fx_pfctl_init(void)
+{
+	unsigned long implementor = read_cpuid_implementor();
+	unsigned long part_number = read_cpuid_part_number();
+
+	if (!is_kernel_in_hyp_mode())
+		return -EINVAL;
+
+	if ((implementor != ARM_CPU_IMP_FUJITSU) ||
+	    (part_number != FUJITSU_CPU_PART_A64FX))
+		return -ENODEV;
+
+	return pfctl_register_attrs(a64fx_pfctl_groups);
+}
+
+static void __exit a64fx_pfctl_exit(void)
+{
+	return pfctl_unregister_attrs(a64fx_pfctl_groups);
+}
+
+late_initcall(a64fx_pfctl_init);
+module_exit(a64fx_pfctl_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("FUJITSU LIMITED");
+MODULE_DESCRIPTION("A64FX Hardware Prefetch Control Driver");
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 4/8] soc: fujitsu: Add Kconfig/Makefile to build hardware prefetch control driver
  2022-05-18  6:30 ` Kohei Tarumizu
@ 2022-05-18  6:30   ` Kohei Tarumizu
  -1 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds Kconfig/Makefile to build hardware prefetch control driver for
A64FX support, and also adds a MAINTAINERS entry.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 MAINTAINERS                  |  1 +
 drivers/soc/Kconfig          |  1 +
 drivers/soc/Makefile         |  1 +
 drivers/soc/fujitsu/Kconfig  | 11 +++++++++++
 drivers/soc/fujitsu/Makefile |  2 ++
 5 files changed, 16 insertions(+)
 create mode 100644 drivers/soc/fujitsu/Kconfig
 create mode 100644 drivers/soc/fujitsu/Makefile

diff --git a/MAINTAINERS b/MAINTAINERS
index f188403bc2e9..dd1345087881 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8624,6 +8624,7 @@ HARDWARE PREFETCH CONTROL DRIVERS
 M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
 S:	Maintained
 F:	drivers/base/pfctl.c
+F:	drivers/soc/fujitsu/a64fx-pfctl.c
 F:	include/linux/pfctl.h
 
 HARDWARE RANDOM NUMBER GENERATOR CORE
diff --git a/drivers/soc/Kconfig b/drivers/soc/Kconfig
index c5aae42673d3..d87754799d90 100644
--- a/drivers/soc/Kconfig
+++ b/drivers/soc/Kconfig
@@ -9,6 +9,7 @@ source "drivers/soc/atmel/Kconfig"
 source "drivers/soc/bcm/Kconfig"
 source "drivers/soc/canaan/Kconfig"
 source "drivers/soc/fsl/Kconfig"
+source "drivers/soc/fujitsu/Kconfig"
 source "drivers/soc/imx/Kconfig"
 source "drivers/soc/ixp4xx/Kconfig"
 source "drivers/soc/litex/Kconfig"
diff --git a/drivers/soc/Makefile b/drivers/soc/Makefile
index 904eec2a7871..6c8ff1792cda 100644
--- a/drivers/soc/Makefile
+++ b/drivers/soc/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_SOC_CANAAN)	+= canaan/
 obj-$(CONFIG_ARCH_DOVE)		+= dove/
 obj-$(CONFIG_MACH_DOVE)		+= dove/
 obj-y				+= fsl/
+obj-y				+= fujitsu/
 obj-$(CONFIG_ARCH_GEMINI)	+= gemini/
 obj-y				+= imx/
 obj-y				+= ixp4xx/
diff --git a/drivers/soc/fujitsu/Kconfig b/drivers/soc/fujitsu/Kconfig
new file mode 100644
index 000000000000..d711e6139840
--- /dev/null
+++ b/drivers/soc/fujitsu/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+menu "Fujitsu SoC drivers"
+
+config A64FX_HWPF_CONTROL
+	tristate "A64FX Hardware Prefetch Control driver"
+	depends on ARM64 && HWPF_CONTROL
+	help
+	  This adds Hardware Prefetch driver control support for A64FX.
+
+endmenu
diff --git a/drivers/soc/fujitsu/Makefile b/drivers/soc/fujitsu/Makefile
new file mode 100644
index 000000000000..35e284a548bb
--- /dev/null
+++ b/drivers/soc/fujitsu/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_A64FX_HWPF_CONTROL)	+= a64fx-pfctl.o
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 4/8] soc: fujitsu: Add Kconfig/Makefile to build hardware prefetch control driver
@ 2022-05-18  6:30   ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds Kconfig/Makefile to build hardware prefetch control driver for
A64FX support, and also adds a MAINTAINERS entry.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 MAINTAINERS                  |  1 +
 drivers/soc/Kconfig          |  1 +
 drivers/soc/Makefile         |  1 +
 drivers/soc/fujitsu/Kconfig  | 11 +++++++++++
 drivers/soc/fujitsu/Makefile |  2 ++
 5 files changed, 16 insertions(+)
 create mode 100644 drivers/soc/fujitsu/Kconfig
 create mode 100644 drivers/soc/fujitsu/Makefile

diff --git a/MAINTAINERS b/MAINTAINERS
index f188403bc2e9..dd1345087881 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8624,6 +8624,7 @@ HARDWARE PREFETCH CONTROL DRIVERS
 M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
 S:	Maintained
 F:	drivers/base/pfctl.c
+F:	drivers/soc/fujitsu/a64fx-pfctl.c
 F:	include/linux/pfctl.h
 
 HARDWARE RANDOM NUMBER GENERATOR CORE
diff --git a/drivers/soc/Kconfig b/drivers/soc/Kconfig
index c5aae42673d3..d87754799d90 100644
--- a/drivers/soc/Kconfig
+++ b/drivers/soc/Kconfig
@@ -9,6 +9,7 @@ source "drivers/soc/atmel/Kconfig"
 source "drivers/soc/bcm/Kconfig"
 source "drivers/soc/canaan/Kconfig"
 source "drivers/soc/fsl/Kconfig"
+source "drivers/soc/fujitsu/Kconfig"
 source "drivers/soc/imx/Kconfig"
 source "drivers/soc/ixp4xx/Kconfig"
 source "drivers/soc/litex/Kconfig"
diff --git a/drivers/soc/Makefile b/drivers/soc/Makefile
index 904eec2a7871..6c8ff1792cda 100644
--- a/drivers/soc/Makefile
+++ b/drivers/soc/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_SOC_CANAAN)	+= canaan/
 obj-$(CONFIG_ARCH_DOVE)		+= dove/
 obj-$(CONFIG_MACH_DOVE)		+= dove/
 obj-y				+= fsl/
+obj-y				+= fujitsu/
 obj-$(CONFIG_ARCH_GEMINI)	+= gemini/
 obj-y				+= imx/
 obj-y				+= ixp4xx/
diff --git a/drivers/soc/fujitsu/Kconfig b/drivers/soc/fujitsu/Kconfig
new file mode 100644
index 000000000000..d711e6139840
--- /dev/null
+++ b/drivers/soc/fujitsu/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+menu "Fujitsu SoC drivers"
+
+config A64FX_HWPF_CONTROL
+	tristate "A64FX Hardware Prefetch Control driver"
+	depends on ARM64 && HWPF_CONTROL
+	help
+	  This adds Hardware Prefetch driver control support for A64FX.
+
+endmenu
diff --git a/drivers/soc/fujitsu/Makefile b/drivers/soc/fujitsu/Makefile
new file mode 100644
index 000000000000..35e284a548bb
--- /dev/null
+++ b/drivers/soc/fujitsu/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_A64FX_HWPF_CONTROL)	+= a64fx-pfctl.o
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control
  2022-05-18  6:30 ` Kohei Tarumizu
@ 2022-05-18  6:30   ` Kohei Tarumizu
  -1 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Create a cache sysfs directory without ACPI PPTT if the
CONFIG_HWPF_CONTROL is true.

Hardware prefetch control driver need cache sysfs directory and cache
level/type information. In ARM processor, these information can be
obtained from the register even without PPTT.

This patch set the cpu_map_populated to true if the machine doesn't
have PPTT. It use only the level/type information obtained from
CLIDR_EL1, and don't use CCSIDR information.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 arch/arm64/kernel/cacheinfo.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
index 587543c6c51c..039ec32d0b3d 100644
--- a/arch/arm64/kernel/cacheinfo.c
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -43,6 +43,21 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 	this_leaf->type = type;
 }
 
+#if defined(CONFIG_HWPF_CONTROL)
+static bool acpi_has_pptt(void)
+{
+	struct acpi_table_header *table;
+	acpi_status status;
+
+	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+	if (ACPI_FAILURE(status))
+		return false;
+
+	acpi_put_table(table);
+	return true;
+}
+#endif
+
 int init_cache_level(unsigned int cpu)
 {
 	unsigned int ctype, level, leaves, fw_level;
@@ -95,5 +110,19 @@ int populate_cache_leaves(unsigned int cpu)
 			ci_leaf_init(this_leaf++, type, level);
 		}
 	}
+
+#if defined(CONFIG_HWPF_CONTROL)
+	/*
+	 * Hardware prefetch functions need cache sysfs directory and cache
+	 * level/type information. In ARM processor, these information can be
+	 * obtained from registers even without PPTT. Therefore, we set the
+	 * cpu_map_populated to true to create cache sysfs directory, if the
+	 * machine doesn't have PPTT.
+	 **/
+	if (!acpi_disabled)
+		if (!acpi_has_pptt())
+			this_cpu_ci->cpu_map_populated = true;
+#endif
+
 	return 0;
 }
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control
@ 2022-05-18  6:30   ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Create a cache sysfs directory without ACPI PPTT if the
CONFIG_HWPF_CONTROL is true.

Hardware prefetch control driver need cache sysfs directory and cache
level/type information. In ARM processor, these information can be
obtained from the register even without PPTT.

This patch set the cpu_map_populated to true if the machine doesn't
have PPTT. It use only the level/type information obtained from
CLIDR_EL1, and don't use CCSIDR information.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 arch/arm64/kernel/cacheinfo.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
index 587543c6c51c..039ec32d0b3d 100644
--- a/arch/arm64/kernel/cacheinfo.c
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -43,6 +43,21 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 	this_leaf->type = type;
 }
 
+#if defined(CONFIG_HWPF_CONTROL)
+static bool acpi_has_pptt(void)
+{
+	struct acpi_table_header *table;
+	acpi_status status;
+
+	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+	if (ACPI_FAILURE(status))
+		return false;
+
+	acpi_put_table(table);
+	return true;
+}
+#endif
+
 int init_cache_level(unsigned int cpu)
 {
 	unsigned int ctype, level, leaves, fw_level;
@@ -95,5 +110,19 @@ int populate_cache_leaves(unsigned int cpu)
 			ci_leaf_init(this_leaf++, type, level);
 		}
 	}
+
+#if defined(CONFIG_HWPF_CONTROL)
+	/*
+	 * Hardware prefetch functions need cache sysfs directory and cache
+	 * level/type information. In ARM processor, these information can be
+	 * obtained from registers even without PPTT. Therefore, we set the
+	 * cpu_map_populated to true to create cache sysfs directory, if the
+	 * machine doesn't have PPTT.
+	 **/
+	if (!acpi_disabled)
+		if (!acpi_has_pptt())
+			this_cpu_ci->cpu_map_populated = true;
+#endif
+
 	return 0;
 }
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
  2022-05-18  6:30 ` Kohei Tarumizu
@ 2022-05-18  6:30   ` Kohei Tarumizu
  -1 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds module init/exit code to create sysfs attributes for x86 with
"hardware_prefetcher_enable", "ip_prefetcher_enable" and
"adjacent_cache_line_prefetcher_enable".

This driver works only if a CPU model is mapped to type of register
specification(e.g. TYPE_L12_BASE) in pfctl_match[].

The details of the registers(MSR_MISC_FEATURE_CONTROL) to be read and
written in this patch are described below:

"https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html"
    Volume 4

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 arch/x86/kernel/cpu/x86-pfctl.c | 258 ++++++++++++++++++++++++++++++++
 1 file changed, 258 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/x86-pfctl.c

diff --git a/arch/x86/kernel/cpu/x86-pfctl.c b/arch/x86/kernel/cpu/x86-pfctl.c
new file mode 100644
index 000000000000..0b3b22128dff
--- /dev/null
+++ b/arch/x86/kernel/cpu/x86-pfctl.c
@@ -0,0 +1,258 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2022 FUJITSU LIMITED
+ *
+ * x86 Hardware Prefetch Control support
+ */
+
+#include <linux/cacheinfo.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/pfctl.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/sysfs.h>
+#include <asm/cpu_device_id.h>
+#include <asm/intel-family.h>
+#include <asm/msr.h>
+
+/*
+ * MSR_MISC_FEATURE_CONTROL has three type of register specifications.
+ *
+ * The register specification of TYPE_L12_BASE is as follow:
+ * [0]    L2 Hardware Prefetcher Disable (R/W)
+ * [1]    Reserved
+ * [2]    DCU Hardware Prefetcher Disable (R/W)
+ * [63:3] Reserved
+ *
+ * The register specification of TYPE_L12_PLUS is as follow:
+ * [0]    L2 Hardware Prefetcher Disable (R/W)
+ * [1]    L2 Adjacent Cache Line Prefetcher Disable (R/W)
+ * [2]    DCU Hardware Prefetcher Disable (R/W)
+ * [3]    DCU IP Prefetcher Disable (R/W)
+ * [63:4] Reserved
+ *
+ * The register specification of TYPE_L12_XPHI is as follow:
+ * [0]    L2 Hardware Prefetcher Disable (R/W)
+ * [1]    DCU Hardware Prefetcher Disable (R/W)
+ * [63:2] Reserved
+ *
+ * See "Intel 64 and IA-32 Architectures Software Developer's Manual"
+ * (https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html)
+ * for register specification details.
+ */
+enum {
+	TYPE_L12_BASE,
+	TYPE_L12_PLUS,
+	TYPE_L12_XPHI,
+};
+
+struct x86_pfctl_attr {
+	struct device_attribute		attr;
+	u64				mask;
+};
+
+/*
+ * Returns the cpu number of the cpu_device(/sys/devices/system/cpu/cpuX)
+ * in the ancestor directory of prefetch_control.
+ *
+ * When initializing this driver, it is verified that the cache directory exists
+ * under cpuX device. Therefore, the third level up from prefetch_control is
+ * cpuX device as shown below.
+ *
+ * /sys/devices/system/cpu/cpuX/cache/indexX/prefetch_control
+ */
+static inline unsigned int pfctl_dev_get_cpu(struct device *pfctl_dev)
+{
+	return pfctl_dev->parent->parent->parent->id;
+}
+
+static ssize_t
+pfctl_show(struct device *pfctl_dev, struct device_attribute *attr, char *buf)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct x86_pfctl_attr *xa;
+	u64 val;
+
+	xa = container_of(attr, struct x86_pfctl_attr, attr);
+
+	rdmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, &val);
+	return sysfs_emit(buf, "%d\n", val & xa->mask ? 0 : 1);
+}
+
+struct write_info {
+	u64 mask;
+	bool enable;
+};
+
+/*
+ * wrmsrl() in this patch is only done inside of an interrupt-disabled region
+ * to avoid a conflict of write access from other drivers,
+ */
+static void pfctl_write(void *info)
+{
+	struct write_info *winfo = info;
+	u64 reg;
+
+	reg = 0;
+	rdmsrl(MSR_MISC_FEATURE_CONTROL, reg);
+
+	if (winfo->enable)
+		reg &= ~winfo->mask;
+	else
+		reg |= winfo->mask;
+
+	wrmsrl(MSR_MISC_FEATURE_CONTROL, reg);
+}
+
+/*
+ * MSR_MISC_FEATURE_CONTROL has "core" scope, so define the lock to avoid a
+ * conflict of write access from different logical processors in the same core.
+ */
+static DEFINE_MUTEX(pfctl_mutex);
+
+static ssize_t
+pfctl_store(struct device *pfctl_dev, struct device_attribute *attr,
+	    const char *buf, size_t size)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct x86_pfctl_attr *xa;
+	struct write_info info;
+
+	xa = container_of(attr, struct x86_pfctl_attr, attr);
+	info.mask = xa->mask;
+
+	if (strtobool(buf, &info.enable) < 0)
+		return -EINVAL;
+
+	mutex_lock(&pfctl_mutex);
+	smp_call_function_single(cpu, pfctl_write, &info, true);
+	mutex_unlock(&pfctl_mutex);
+
+	return size;
+}
+
+#define PFCTL_ATTR(_name, _level, _bit)					\
+	struct x86_pfctl_attr attr_l##_level##_##_name = {		\
+		.attr = __ATTR(_name, 0600, pfctl_show, pfctl_store),	\
+		.mask = BIT_ULL(_bit), }
+
+static PFCTL_ATTR(hardware_prefetcher_enable,			1, 2);
+static PFCTL_ATTR(hardware_prefetcher_enable,			2, 0);
+static PFCTL_ATTR(ip_prefetcher_enable,				1, 3);
+static PFCTL_ATTR(adjacent_cache_line_prefetcher_enable,	2, 1);
+
+static struct device_attribute *l1_attrs[] __ro_after_init = {
+	&attr_l1_hardware_prefetcher_enable.attr,
+	&attr_l1_ip_prefetcher_enable.attr,
+	NULL,
+};
+
+static struct device_attribute *l2_attrs[] __ro_after_init = {
+	&attr_l2_hardware_prefetcher_enable.attr,
+	&attr_l2_adjacent_cache_line_prefetcher_enable.attr,
+	NULL,
+};
+
+static const struct pfctl_group x86_pfctl_groups[] = {
+	{
+		.level = 1,
+		.type = CACHE_TYPE_DATA,
+		.attrs = l1_attrs,
+	},
+	{
+		.level = 2,
+		.type = CACHE_TYPE_UNIFIED,
+		.attrs = l2_attrs,
+	},
+	{
+		.attrs = NULL,
+	},
+};
+
+/*
+ * Only BROADWELL_X has been tested in the actual machine at this point. Other
+ * models were defined based on the information in the "Intel 64 and IA-32
+ * Architectures Software Developer's Manual"
+ */
+static const struct x86_cpu_id pfctl_match[] __initconst = {
+	X86_MATCH_INTEL_FAM6_MODEL(ATOM_SILVERMONT_D,	TYPE_L12_BASE),
+	X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT,	TYPE_L12_BASE),
+	X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT_PLUS,	TYPE_L12_BASE),
+	X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_D,	TYPE_L12_BASE),
+	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM_G,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM_EP,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM_EX,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SANDYBRIDGE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SANDYBRIDGE_X,	TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(IVYBRIDGE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(IVYBRIDGE_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(HASWELL,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(HASWELL_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(HASWELL_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(HASWELL_G,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(BROADWELL,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_G,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_D,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(CANNONLAKE_L,	TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL,	TYPE_L12_XPHI),
+	X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM,	TYPE_L12_XPHI),
+	{},
+};
+
+static int __init x86_pfctl_init(void)
+{
+	const struct x86_cpu_id *m;
+
+	if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
+		return -ENODEV;
+
+	m = x86_match_cpu(pfctl_match);
+	if (!m)
+		return -ENODEV;
+
+	switch (m->driver_data) {
+	case TYPE_L12_BASE:
+		l1_attrs[1] = NULL;
+		l2_attrs[1] = NULL;
+		break;
+	case TYPE_L12_PLUS:
+		break;
+	case TYPE_L12_XPHI:
+		attr_l1_hardware_prefetcher_enable.mask = BIT_ULL(1);
+		l1_attrs[1] = NULL;
+		l2_attrs[1] = NULL;
+		break;
+	default:
+		return -ENODEV;
+	};
+
+	return pfctl_register_attrs(x86_pfctl_groups);
+}
+
+static void __exit x86_pfctl_exit(void)
+{
+	return pfctl_unregister_attrs(x86_pfctl_groups);
+}
+
+late_initcall(x86_pfctl_init);
+module_exit(x86_pfctl_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("FUJITSU LIMITED");
+MODULE_DESCRIPTION("x86 Hardware Prefetch Control Driver");
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
@ 2022-05-18  6:30   ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

Adds module init/exit code to create sysfs attributes for x86 with
"hardware_prefetcher_enable", "ip_prefetcher_enable" and
"adjacent_cache_line_prefetcher_enable".

This driver works only if a CPU model is mapped to type of register
specification(e.g. TYPE_L12_BASE) in pfctl_match[].

The details of the registers(MSR_MISC_FEATURE_CONTROL) to be read and
written in this patch are described below:

"https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html"
    Volume 4

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 arch/x86/kernel/cpu/x86-pfctl.c | 258 ++++++++++++++++++++++++++++++++
 1 file changed, 258 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/x86-pfctl.c

diff --git a/arch/x86/kernel/cpu/x86-pfctl.c b/arch/x86/kernel/cpu/x86-pfctl.c
new file mode 100644
index 000000000000..0b3b22128dff
--- /dev/null
+++ b/arch/x86/kernel/cpu/x86-pfctl.c
@@ -0,0 +1,258 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2022 FUJITSU LIMITED
+ *
+ * x86 Hardware Prefetch Control support
+ */
+
+#include <linux/cacheinfo.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/pfctl.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/sysfs.h>
+#include <asm/cpu_device_id.h>
+#include <asm/intel-family.h>
+#include <asm/msr.h>
+
+/*
+ * MSR_MISC_FEATURE_CONTROL has three type of register specifications.
+ *
+ * The register specification of TYPE_L12_BASE is as follow:
+ * [0]    L2 Hardware Prefetcher Disable (R/W)
+ * [1]    Reserved
+ * [2]    DCU Hardware Prefetcher Disable (R/W)
+ * [63:3] Reserved
+ *
+ * The register specification of TYPE_L12_PLUS is as follow:
+ * [0]    L2 Hardware Prefetcher Disable (R/W)
+ * [1]    L2 Adjacent Cache Line Prefetcher Disable (R/W)
+ * [2]    DCU Hardware Prefetcher Disable (R/W)
+ * [3]    DCU IP Prefetcher Disable (R/W)
+ * [63:4] Reserved
+ *
+ * The register specification of TYPE_L12_XPHI is as follow:
+ * [0]    L2 Hardware Prefetcher Disable (R/W)
+ * [1]    DCU Hardware Prefetcher Disable (R/W)
+ * [63:2] Reserved
+ *
+ * See "Intel 64 and IA-32 Architectures Software Developer's Manual"
+ * (https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html)
+ * for register specification details.
+ */
+enum {
+	TYPE_L12_BASE,
+	TYPE_L12_PLUS,
+	TYPE_L12_XPHI,
+};
+
+struct x86_pfctl_attr {
+	struct device_attribute		attr;
+	u64				mask;
+};
+
+/*
+ * Returns the cpu number of the cpu_device(/sys/devices/system/cpu/cpuX)
+ * in the ancestor directory of prefetch_control.
+ *
+ * When initializing this driver, it is verified that the cache directory exists
+ * under cpuX device. Therefore, the third level up from prefetch_control is
+ * cpuX device as shown below.
+ *
+ * /sys/devices/system/cpu/cpuX/cache/indexX/prefetch_control
+ */
+static inline unsigned int pfctl_dev_get_cpu(struct device *pfctl_dev)
+{
+	return pfctl_dev->parent->parent->parent->id;
+}
+
+static ssize_t
+pfctl_show(struct device *pfctl_dev, struct device_attribute *attr, char *buf)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct x86_pfctl_attr *xa;
+	u64 val;
+
+	xa = container_of(attr, struct x86_pfctl_attr, attr);
+
+	rdmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, &val);
+	return sysfs_emit(buf, "%d\n", val & xa->mask ? 0 : 1);
+}
+
+struct write_info {
+	u64 mask;
+	bool enable;
+};
+
+/*
+ * wrmsrl() in this patch is only done inside of an interrupt-disabled region
+ * to avoid a conflict of write access from other drivers,
+ */
+static void pfctl_write(void *info)
+{
+	struct write_info *winfo = info;
+	u64 reg;
+
+	reg = 0;
+	rdmsrl(MSR_MISC_FEATURE_CONTROL, reg);
+
+	if (winfo->enable)
+		reg &= ~winfo->mask;
+	else
+		reg |= winfo->mask;
+
+	wrmsrl(MSR_MISC_FEATURE_CONTROL, reg);
+}
+
+/*
+ * MSR_MISC_FEATURE_CONTROL has "core" scope, so define the lock to avoid a
+ * conflict of write access from different logical processors in the same core.
+ */
+static DEFINE_MUTEX(pfctl_mutex);
+
+static ssize_t
+pfctl_store(struct device *pfctl_dev, struct device_attribute *attr,
+	    const char *buf, size_t size)
+{
+	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
+	struct x86_pfctl_attr *xa;
+	struct write_info info;
+
+	xa = container_of(attr, struct x86_pfctl_attr, attr);
+	info.mask = xa->mask;
+
+	if (strtobool(buf, &info.enable) < 0)
+		return -EINVAL;
+
+	mutex_lock(&pfctl_mutex);
+	smp_call_function_single(cpu, pfctl_write, &info, true);
+	mutex_unlock(&pfctl_mutex);
+
+	return size;
+}
+
+#define PFCTL_ATTR(_name, _level, _bit)					\
+	struct x86_pfctl_attr attr_l##_level##_##_name = {		\
+		.attr = __ATTR(_name, 0600, pfctl_show, pfctl_store),	\
+		.mask = BIT_ULL(_bit), }
+
+static PFCTL_ATTR(hardware_prefetcher_enable,			1, 2);
+static PFCTL_ATTR(hardware_prefetcher_enable,			2, 0);
+static PFCTL_ATTR(ip_prefetcher_enable,				1, 3);
+static PFCTL_ATTR(adjacent_cache_line_prefetcher_enable,	2, 1);
+
+static struct device_attribute *l1_attrs[] __ro_after_init = {
+	&attr_l1_hardware_prefetcher_enable.attr,
+	&attr_l1_ip_prefetcher_enable.attr,
+	NULL,
+};
+
+static struct device_attribute *l2_attrs[] __ro_after_init = {
+	&attr_l2_hardware_prefetcher_enable.attr,
+	&attr_l2_adjacent_cache_line_prefetcher_enable.attr,
+	NULL,
+};
+
+static const struct pfctl_group x86_pfctl_groups[] = {
+	{
+		.level = 1,
+		.type = CACHE_TYPE_DATA,
+		.attrs = l1_attrs,
+	},
+	{
+		.level = 2,
+		.type = CACHE_TYPE_UNIFIED,
+		.attrs = l2_attrs,
+	},
+	{
+		.attrs = NULL,
+	},
+};
+
+/*
+ * Only BROADWELL_X has been tested in the actual machine at this point. Other
+ * models were defined based on the information in the "Intel 64 and IA-32
+ * Architectures Software Developer's Manual"
+ */
+static const struct x86_cpu_id pfctl_match[] __initconst = {
+	X86_MATCH_INTEL_FAM6_MODEL(ATOM_SILVERMONT_D,	TYPE_L12_BASE),
+	X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT,	TYPE_L12_BASE),
+	X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT_PLUS,	TYPE_L12_BASE),
+	X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_D,	TYPE_L12_BASE),
+	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM_G,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM_EP,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(NEHALEM_EX,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SANDYBRIDGE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SANDYBRIDGE_X,	TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(IVYBRIDGE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(IVYBRIDGE_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(HASWELL,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(HASWELL_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(HASWELL_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(HASWELL_G,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(BROADWELL,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_G,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_D,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(CANNONLAKE_L,	TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE,		TYPE_L12_PLUS),
+	X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL,	TYPE_L12_XPHI),
+	X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM,	TYPE_L12_XPHI),
+	{},
+};
+
+static int __init x86_pfctl_init(void)
+{
+	const struct x86_cpu_id *m;
+
+	if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
+		return -ENODEV;
+
+	m = x86_match_cpu(pfctl_match);
+	if (!m)
+		return -ENODEV;
+
+	switch (m->driver_data) {
+	case TYPE_L12_BASE:
+		l1_attrs[1] = NULL;
+		l2_attrs[1] = NULL;
+		break;
+	case TYPE_L12_PLUS:
+		break;
+	case TYPE_L12_XPHI:
+		attr_l1_hardware_prefetcher_enable.mask = BIT_ULL(1);
+		l1_attrs[1] = NULL;
+		l2_attrs[1] = NULL;
+		break;
+	default:
+		return -ENODEV;
+	};
+
+	return pfctl_register_attrs(x86_pfctl_groups);
+}
+
+static void __exit x86_pfctl_exit(void)
+{
+	return pfctl_unregister_attrs(x86_pfctl_groups);
+}
+
+late_initcall(x86_pfctl_init);
+module_exit(x86_pfctl_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("FUJITSU LIMITED");
+MODULE_DESCRIPTION("x86 Hardware Prefetch Control Driver");
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver
  2022-05-18  6:30 ` Kohei Tarumizu
@ 2022-05-18  6:30   ` Kohei Tarumizu
  -1 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

This adds Kconfig/Makefile to build hardware prefetch control driver
for x86 support. This also adds a MAINTAINERS entry.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 MAINTAINERS                  | 1 +
 arch/x86/Kconfig             | 6 ++++++
 arch/x86/kernel/cpu/Makefile | 2 ++
 3 files changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index dd1345087881..9759c3606c7d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8623,6 +8623,7 @@ K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
 HARDWARE PREFETCH CONTROL DRIVERS
 M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
 S:	Maintained
+F:	arch/x86/kernel/cpu/x86-pfctl.c
 F:	drivers/base/pfctl.c
 F:	drivers/soc/fujitsu/a64fx-pfctl.c
 F:	include/linux/pfctl.h
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4bed3abf444d..81df9efc69e5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1359,6 +1359,12 @@ config X86_CPUID
 	  with major 203 and minors 0 to 31 for /dev/cpu/0/cpuid to
 	  /dev/cpu/31/cpuid.
 
+config X86_HWPF_CONTROL
+	tristate "x86 Hardware Prefetch Control support"
+	depends on X86_64 && HWPF_CONTROL
+	help
+	  This adds Hardware Prefetch driver control support for X86.
+
 choice
 	prompt "High Memory Support"
 	default HIGHMEM4G
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 9661e3e802be..1aa13dad17a3 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -56,6 +56,8 @@ obj-$(CONFIG_X86_LOCAL_APIC)		+= perfctr-watchdog.o
 obj-$(CONFIG_HYPERVISOR_GUEST)		+= vmware.o hypervisor.o mshyperv.o
 obj-$(CONFIG_ACRN_GUEST)		+= acrn.o
 
+obj-$(CONFIG_X86_HWPF_CONTROL)		+= x86-pfctl.o
+
 ifdef CONFIG_X86_FEATURE_NAMES
 quiet_cmd_mkcapflags = MKCAP   $@
       cmd_mkcapflags = $(CONFIG_SHELL) $(srctree)/$(src)/mkcapflags.sh $@ $^
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver
@ 2022-05-18  6:30   ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

This adds Kconfig/Makefile to build hardware prefetch control driver
for x86 support. This also adds a MAINTAINERS entry.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 MAINTAINERS                  | 1 +
 arch/x86/Kconfig             | 6 ++++++
 arch/x86/kernel/cpu/Makefile | 2 ++
 3 files changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index dd1345087881..9759c3606c7d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8623,6 +8623,7 @@ K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
 HARDWARE PREFETCH CONTROL DRIVERS
 M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
 S:	Maintained
+F:	arch/x86/kernel/cpu/x86-pfctl.c
 F:	drivers/base/pfctl.c
 F:	drivers/soc/fujitsu/a64fx-pfctl.c
 F:	include/linux/pfctl.h
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4bed3abf444d..81df9efc69e5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1359,6 +1359,12 @@ config X86_CPUID
 	  with major 203 and minors 0 to 31 for /dev/cpu/0/cpuid to
 	  /dev/cpu/31/cpuid.
 
+config X86_HWPF_CONTROL
+	tristate "x86 Hardware Prefetch Control support"
+	depends on X86_64 && HWPF_CONTROL
+	help
+	  This adds Hardware Prefetch driver control support for X86.
+
 choice
 	prompt "High Memory Support"
 	default HIGHMEM4G
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 9661e3e802be..1aa13dad17a3 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -56,6 +56,8 @@ obj-$(CONFIG_X86_LOCAL_APIC)		+= perfctr-watchdog.o
 obj-$(CONFIG_HYPERVISOR_GUEST)		+= vmware.o hypervisor.o mshyperv.o
 obj-$(CONFIG_ACRN_GUEST)		+= acrn.o
 
+obj-$(CONFIG_X86_HWPF_CONTROL)		+= x86-pfctl.o
+
 ifdef CONFIG_X86_FEATURE_NAMES
 quiet_cmd_mkcapflags = MKCAP   $@
       cmd_mkcapflags = $(CONFIG_SHELL) $(srctree)/$(src)/mkcapflags.sh $@ $^
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 8/8] docs: ABI: Add sysfs documentation interface of hardware prefetch control driver
  2022-05-18  6:30 ` Kohei Tarumizu
@ 2022-05-18  6:30   ` Kohei Tarumizu
  -1 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

This describes the sysfs interface implemented by the hardware prefetch
control driver.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 .../ABI/testing/sysfs-devices-system-cpu      | 98 +++++++++++++++++++
 1 file changed, 98 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 2ad01cad7f1c..0da4c1bac51e 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -688,3 +688,101 @@ Description:
 		(RO) the list of CPUs that are isolated and don't
 		participate in load balancing. These CPUs are set by
 		boot parameter "isolcpus=".
+
+What:		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/hardware_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/ip_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/adjacent_cache_line_prefetcher_enable
+Date:		March 2022
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Parameters for some Intel CPU's hardware prefetch control
+
+		This sysfs interface provides Hardware Prefetch control
+		attribute for some Intel processors. Attributes are only
+		present if the particular cache implements the relevant
+		prefetcher controls.
+
+		*_prefetcher_enable:
+		    (RW) control this prefetcher's enablement state.
+		    Read returns current status:
+			0: this prefetcher is disabled
+			1: this prefetcher is enabled
+
+		- Attribute mapping
+
+		    Some Intel processors have MSR 0x1a4. This register has several
+		    specifications depending on the model. This interface provides
+		    a one-to-one attribute file to control all the tunable
+		    parameters the CPU provides of the following.
+
+			- "* Hardware Prefetcher Disable (R/W)"
+			    corresponds to the "hardware_prefetcher_enable"
+
+			- "* Adjacent Cache Line Prefetcher Disable (R/W)"
+			    corresponds to the "adjacent_cache_line_prefetcher_enable"
+
+			- "* IP Prefetcher Disable (R/W)"
+			    corresponds to the "ip_prefetcher_enable"
+
+What:		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/stream_detect_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/stream_detect_prefetcher_strength
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/stream_detect_prefetcher_strength_available
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/stream_detect_prefetcher_dist
+Date:		March 2022
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Parameters for A64FX's hardware prefetch control
+
+		This sysfs interface provides Hardware Prefetch control
+		attribute for the processor A64FX. Attributes are only
+		present if the particular cache implements the relevant
+		prefetcher controls.
+
+		stream_detect_prefetcher_enable:
+		    (RW) control the prefetcher's enablement state.
+		    Read returns current status:
+			0: this prefetcher is disabled
+			1: this prefetcher is enabled
+
+		stream_detect_prefetcher_strength:
+		    (RW) control the prefetcher operation's strongness state.
+		    Read returns current status:
+			weak: prefetch operation is weak
+			strong: prefetch operation is strong
+
+		    Strong prefetch operation is surely executed, if there is
+		    no corresponding data in cache.
+		    Weak prefetch operation allows the hardware not to execute
+		    operation depending on hardware state.
+
+
+		stream_detect_prefetcher_strength_available:
+		    (RO) displays a space separated list of available strongness
+		    state.
+
+		stream_detect_prefetcher_dist:
+		    (RW) control the prefetcher distance value.
+		    Read return current prefetcher distance value in bytes
+		    or the string "auto".
+
+		    Write either a value in byte or the string "auto" to this
+		    parameter. If you write a value less than multiples of a
+		    specific value, it is rounded up.
+
+		    The string "auto" have a special meaning. This means that
+		    instead of setting dist to a user-specified value, it
+		    operates using hardware-specific values.
+
+		- Attribute mapping
+
+		    The processor A64FX has register IMP_PF_STREAM_DETECT_CTRL_EL0
+		    for Hardware Prefetch Control. This attribute maps each
+		    specification to the following.
+
+			- "L*PF_DIS": enablement of hardware prefetcher
+			    corresponds to the "stream_detect_prefetcher_enable"
+
+			- "L*W": strongness of hardware prefetcher
+			    corresponds to "stream_detect_prefetcher_strength"
+			    and "stream_detect_prefetcher_strength_available"
+
+			- "L*_DIST": distance of hardware prefetcher
+			    corresponds to the "stream_detect_prefetcher_dist"
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 8/8] docs: ABI: Add sysfs documentation interface of hardware prefetch control driver
@ 2022-05-18  6:30   ` Kohei Tarumizu
  0 siblings, 0 replies; 50+ messages in thread
From: Kohei Tarumizu @ 2022-05-18  6:30 UTC (permalink / raw)
  To: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel
  Cc: tarumizu.kohei

This describes the sysfs interface implemented by the hardware prefetch
control driver.

Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
---
 .../ABI/testing/sysfs-devices-system-cpu      | 98 +++++++++++++++++++
 1 file changed, 98 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 2ad01cad7f1c..0da4c1bac51e 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -688,3 +688,101 @@ Description:
 		(RO) the list of CPUs that are isolated and don't
 		participate in load balancing. These CPUs are set by
 		boot parameter "isolcpus=".
+
+What:		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/hardware_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/ip_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/adjacent_cache_line_prefetcher_enable
+Date:		March 2022
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Parameters for some Intel CPU's hardware prefetch control
+
+		This sysfs interface provides Hardware Prefetch control
+		attribute for some Intel processors. Attributes are only
+		present if the particular cache implements the relevant
+		prefetcher controls.
+
+		*_prefetcher_enable:
+		    (RW) control this prefetcher's enablement state.
+		    Read returns current status:
+			0: this prefetcher is disabled
+			1: this prefetcher is enabled
+
+		- Attribute mapping
+
+		    Some Intel processors have MSR 0x1a4. This register has several
+		    specifications depending on the model. This interface provides
+		    a one-to-one attribute file to control all the tunable
+		    parameters the CPU provides of the following.
+
+			- "* Hardware Prefetcher Disable (R/W)"
+			    corresponds to the "hardware_prefetcher_enable"
+
+			- "* Adjacent Cache Line Prefetcher Disable (R/W)"
+			    corresponds to the "adjacent_cache_line_prefetcher_enable"
+
+			- "* IP Prefetcher Disable (R/W)"
+			    corresponds to the "ip_prefetcher_enable"
+
+What:		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/stream_detect_prefetcher_enable
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/stream_detect_prefetcher_strength
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/stream_detect_prefetcher_strength_available
+		/sys/devices/system/cpu/cpu*/cache/index*/prefetch_control/stream_detect_prefetcher_dist
+Date:		March 2022
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Parameters for A64FX's hardware prefetch control
+
+		This sysfs interface provides Hardware Prefetch control
+		attribute for the processor A64FX. Attributes are only
+		present if the particular cache implements the relevant
+		prefetcher controls.
+
+		stream_detect_prefetcher_enable:
+		    (RW) control the prefetcher's enablement state.
+		    Read returns current status:
+			0: this prefetcher is disabled
+			1: this prefetcher is enabled
+
+		stream_detect_prefetcher_strength:
+		    (RW) control the prefetcher operation's strongness state.
+		    Read returns current status:
+			weak: prefetch operation is weak
+			strong: prefetch operation is strong
+
+		    Strong prefetch operation is surely executed, if there is
+		    no corresponding data in cache.
+		    Weak prefetch operation allows the hardware not to execute
+		    operation depending on hardware state.
+
+
+		stream_detect_prefetcher_strength_available:
+		    (RO) displays a space separated list of available strongness
+		    state.
+
+		stream_detect_prefetcher_dist:
+		    (RW) control the prefetcher distance value.
+		    Read return current prefetcher distance value in bytes
+		    or the string "auto".
+
+		    Write either a value in byte or the string "auto" to this
+		    parameter. If you write a value less than multiples of a
+		    specific value, it is rounded up.
+
+		    The string "auto" have a special meaning. This means that
+		    instead of setting dist to a user-specified value, it
+		    operates using hardware-specific values.
+
+		- Attribute mapping
+
+		    The processor A64FX has register IMP_PF_STREAM_DETECT_CTRL_EL0
+		    for Hardware Prefetch Control. This attribute maps each
+		    specification to the following.
+
+			- "L*PF_DIS": enablement of hardware prefetcher
+			    corresponds to the "stream_detect_prefetcher_enable"
+
+			- "L*W": strongness of hardware prefetcher
+			    corresponds to "stream_detect_prefetcher_strength"
+			    and "stream_detect_prefetcher_strength_available"
+
+			- "L*_DIST": distance of hardware prefetcher
+			    corresponds to the "stream_detect_prefetcher_dist"
-- 
2.27.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
  2022-05-18  6:30   ` Kohei Tarumizu
@ 2022-05-18  6:43     ` Greg KH
  -1 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  6:43 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

On Wed, May 18, 2022 at 03:30:30PM +0900, Kohei Tarumizu wrote:
> +/*
> + * Returns the cpu number of the cpu_device(/sys/devices/system/cpu/cpuX)
> + * in the ancestor directory of prefetch_control.
> + *
> + * When initializing this driver, it is verified that the cache directory exists
> + * under cpuX device. Therefore, the third level up from prefetch_control is
> + * cpuX device as shown below.
> + *
> + * /sys/devices/system/cpu/cpuX/cache/indexX/prefetch_control
> + */
> +static inline unsigned int pfctl_dev_get_cpu(struct device *pfctl_dev)
> +{
> +	return pfctl_dev->parent->parent->parent->id;

As much fun as it is to see a function like this, that is not ok, and
never guaranteed to keep working, sorry.  Please find the device
properly, never assume a specific driver model topology as they are
guaranteed to break over time and never supposed to be static like this.



> +}
> +
> +static ssize_t
> +pfctl_show(struct device *pfctl_dev, struct device_attribute *attr, char *buf)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct x86_pfctl_attr *xa;
> +	u64 val;
> +
> +	xa = container_of(attr, struct x86_pfctl_attr, attr);
> +
> +	rdmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, &val);
> +	return sysfs_emit(buf, "%d\n", val & xa->mask ? 0 : 1);
> +}
> +
> +struct write_info {
> +	u64 mask;
> +	bool enable;
> +};
> +
> +/*
> + * wrmsrl() in this patch is only done inside of an interrupt-disabled region
> + * to avoid a conflict of write access from other drivers,
> + */
> +static void pfctl_write(void *info)
> +{
> +	struct write_info *winfo = info;
> +	u64 reg;
> +
> +	reg = 0;
> +	rdmsrl(MSR_MISC_FEATURE_CONTROL, reg);
> +
> +	if (winfo->enable)
> +		reg &= ~winfo->mask;
> +	else
> +		reg |= winfo->mask;
> +
> +	wrmsrl(MSR_MISC_FEATURE_CONTROL, reg);
> +}
> +
> +/*
> + * MSR_MISC_FEATURE_CONTROL has "core" scope, so define the lock to avoid a
> + * conflict of write access from different logical processors in the same core.
> + */
> +static DEFINE_MUTEX(pfctl_mutex);
> +
> +static ssize_t
> +pfctl_store(struct device *pfctl_dev, struct device_attribute *attr,
> +	    const char *buf, size_t size)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct x86_pfctl_attr *xa;
> +	struct write_info info;
> +
> +	xa = container_of(attr, struct x86_pfctl_attr, attr);
> +	info.mask = xa->mask;
> +
> +	if (strtobool(buf, &info.enable) < 0)
> +		return -EINVAL;
> +
> +	mutex_lock(&pfctl_mutex);
> +	smp_call_function_single(cpu, pfctl_write, &info, true);
> +	mutex_unlock(&pfctl_mutex);
> +
> +	return size;
> +}
> +
> +#define PFCTL_ATTR(_name, _level, _bit)					\
> +	struct x86_pfctl_attr attr_l##_level##_##_name = {		\
> +		.attr = __ATTR(_name, 0600, pfctl_show, pfctl_store),	\
> +		.mask = BIT_ULL(_bit), }
> +
> +static PFCTL_ATTR(hardware_prefetcher_enable,			1, 2);
> +static PFCTL_ATTR(hardware_prefetcher_enable,			2, 0);
> +static PFCTL_ATTR(ip_prefetcher_enable,				1, 3);
> +static PFCTL_ATTR(adjacent_cache_line_prefetcher_enable,	2, 1);
> +
> +static struct device_attribute *l1_attrs[] __ro_after_init = {

How do you know attributes are to be marked read only after init?  Do
other parts of the kernel do that?  I don't think the driver core
guarantees that at all.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
@ 2022-05-18  6:43     ` Greg KH
  0 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  6:43 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

On Wed, May 18, 2022 at 03:30:30PM +0900, Kohei Tarumizu wrote:
> +/*
> + * Returns the cpu number of the cpu_device(/sys/devices/system/cpu/cpuX)
> + * in the ancestor directory of prefetch_control.
> + *
> + * When initializing this driver, it is verified that the cache directory exists
> + * under cpuX device. Therefore, the third level up from prefetch_control is
> + * cpuX device as shown below.
> + *
> + * /sys/devices/system/cpu/cpuX/cache/indexX/prefetch_control
> + */
> +static inline unsigned int pfctl_dev_get_cpu(struct device *pfctl_dev)
> +{
> +	return pfctl_dev->parent->parent->parent->id;

As much fun as it is to see a function like this, that is not ok, and
never guaranteed to keep working, sorry.  Please find the device
properly, never assume a specific driver model topology as they are
guaranteed to break over time and never supposed to be static like this.



> +}
> +
> +static ssize_t
> +pfctl_show(struct device *pfctl_dev, struct device_attribute *attr, char *buf)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct x86_pfctl_attr *xa;
> +	u64 val;
> +
> +	xa = container_of(attr, struct x86_pfctl_attr, attr);
> +
> +	rdmsrl_on_cpu(cpu, MSR_MISC_FEATURE_CONTROL, &val);
> +	return sysfs_emit(buf, "%d\n", val & xa->mask ? 0 : 1);
> +}
> +
> +struct write_info {
> +	u64 mask;
> +	bool enable;
> +};
> +
> +/*
> + * wrmsrl() in this patch is only done inside of an interrupt-disabled region
> + * to avoid a conflict of write access from other drivers,
> + */
> +static void pfctl_write(void *info)
> +{
> +	struct write_info *winfo = info;
> +	u64 reg;
> +
> +	reg = 0;
> +	rdmsrl(MSR_MISC_FEATURE_CONTROL, reg);
> +
> +	if (winfo->enable)
> +		reg &= ~winfo->mask;
> +	else
> +		reg |= winfo->mask;
> +
> +	wrmsrl(MSR_MISC_FEATURE_CONTROL, reg);
> +}
> +
> +/*
> + * MSR_MISC_FEATURE_CONTROL has "core" scope, so define the lock to avoid a
> + * conflict of write access from different logical processors in the same core.
> + */
> +static DEFINE_MUTEX(pfctl_mutex);
> +
> +static ssize_t
> +pfctl_store(struct device *pfctl_dev, struct device_attribute *attr,
> +	    const char *buf, size_t size)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct x86_pfctl_attr *xa;
> +	struct write_info info;
> +
> +	xa = container_of(attr, struct x86_pfctl_attr, attr);
> +	info.mask = xa->mask;
> +
> +	if (strtobool(buf, &info.enable) < 0)
> +		return -EINVAL;
> +
> +	mutex_lock(&pfctl_mutex);
> +	smp_call_function_single(cpu, pfctl_write, &info, true);
> +	mutex_unlock(&pfctl_mutex);
> +
> +	return size;
> +}
> +
> +#define PFCTL_ATTR(_name, _level, _bit)					\
> +	struct x86_pfctl_attr attr_l##_level##_##_name = {		\
> +		.attr = __ATTR(_name, 0600, pfctl_show, pfctl_store),	\
> +		.mask = BIT_ULL(_bit), }
> +
> +static PFCTL_ATTR(hardware_prefetcher_enable,			1, 2);
> +static PFCTL_ATTR(hardware_prefetcher_enable,			2, 0);
> +static PFCTL_ATTR(ip_prefetcher_enable,				1, 3);
> +static PFCTL_ATTR(adjacent_cache_line_prefetcher_enable,	2, 1);
> +
> +static struct device_attribute *l1_attrs[] __ro_after_init = {

How do you know attributes are to be marked read only after init?  Do
other parts of the kernel do that?  I don't think the driver core
guarantees that at all.

thanks,

greg k-h

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver
  2022-05-18  6:30   ` Kohei Tarumizu
@ 2022-05-18  6:43     ` Greg KH
  -1 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  6:43 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

On Wed, May 18, 2022 at 03:30:31PM +0900, Kohei Tarumizu wrote:
> This adds Kconfig/Makefile to build hardware prefetch control driver
> for x86 support. This also adds a MAINTAINERS entry.
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  MAINTAINERS                  | 1 +
>  arch/x86/Kconfig             | 6 ++++++
>  arch/x86/kernel/cpu/Makefile | 2 ++
>  3 files changed, 9 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index dd1345087881..9759c3606c7d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8623,6 +8623,7 @@ K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
>  HARDWARE PREFETCH CONTROL DRIVERS
>  M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
>  S:	Maintained
> +F:	arch/x86/kernel/cpu/x86-pfctl.c
>  F:	drivers/base/pfctl.c
>  F:	drivers/soc/fujitsu/a64fx-pfctl.c
>  F:	include/linux/pfctl.h
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 4bed3abf444d..81df9efc69e5 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1359,6 +1359,12 @@ config X86_CPUID
>  	  with major 203 and minors 0 to 31 for /dev/cpu/0/cpuid to
>  	  /dev/cpu/31/cpuid.
>  
> +config X86_HWPF_CONTROL
> +	tristate "x86 Hardware Prefetch Control support"
> +	depends on X86_64 && HWPF_CONTROL
> +	help
> +	  This adds Hardware Prefetch driver control support for X86.
> +

You need a lot more text here about what this is, what it does, why you
would want it, and the module name it will create.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver
@ 2022-05-18  6:43     ` Greg KH
  0 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  6:43 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

On Wed, May 18, 2022 at 03:30:31PM +0900, Kohei Tarumizu wrote:
> This adds Kconfig/Makefile to build hardware prefetch control driver
> for x86 support. This also adds a MAINTAINERS entry.
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  MAINTAINERS                  | 1 +
>  arch/x86/Kconfig             | 6 ++++++
>  arch/x86/kernel/cpu/Makefile | 2 ++
>  3 files changed, 9 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index dd1345087881..9759c3606c7d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8623,6 +8623,7 @@ K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
>  HARDWARE PREFETCH CONTROL DRIVERS
>  M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
>  S:	Maintained
> +F:	arch/x86/kernel/cpu/x86-pfctl.c
>  F:	drivers/base/pfctl.c
>  F:	drivers/soc/fujitsu/a64fx-pfctl.c
>  F:	include/linux/pfctl.h
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 4bed3abf444d..81df9efc69e5 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1359,6 +1359,12 @@ config X86_CPUID
>  	  with major 203 and minors 0 to 31 for /dev/cpu/0/cpuid to
>  	  /dev/cpu/31/cpuid.
>  
> +config X86_HWPF_CONTROL
> +	tristate "x86 Hardware Prefetch Control support"
> +	depends on X86_64 && HWPF_CONTROL
> +	help
> +	  This adds Hardware Prefetch driver control support for X86.
> +

You need a lot more text here about what this is, what it does, why you
would want it, and the module name it will create.

thanks,

greg k-h

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
  2022-05-18  6:30   ` Kohei Tarumizu
@ 2022-05-18  6:44     ` Greg KH
  -1 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  6:44 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

On Wed, May 18, 2022 at 03:30:30PM +0900, Kohei Tarumizu wrote:
> Adds module init/exit code to create sysfs attributes for x86 with
> "hardware_prefetcher_enable", "ip_prefetcher_enable" and
> "adjacent_cache_line_prefetcher_enable".
> 
> This driver works only if a CPU model is mapped to type of register
> specification(e.g. TYPE_L12_BASE) in pfctl_match[].

How will the driver be automatically loaded if this is the case or not?
You can not rely on userspace knowing to load a driver on its own,
please tie into the proper device discovery.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
@ 2022-05-18  6:44     ` Greg KH
  0 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  6:44 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

On Wed, May 18, 2022 at 03:30:30PM +0900, Kohei Tarumizu wrote:
> Adds module init/exit code to create sysfs attributes for x86 with
> "hardware_prefetcher_enable", "ip_prefetcher_enable" and
> "adjacent_cache_line_prefetcher_enable".
> 
> This driver works only if a CPU model is mapped to type of register
> specification(e.g. TYPE_L12_BASE) in pfctl_match[].

How will the driver be automatically loaded if this is the case or not?
You can not rely on userspace knowing to load a driver on its own,
please tie into the proper device discovery.

thanks,

greg k-h

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/8] drivers: base: Add Kconfig/Makefile to build hardware prefetch control core driver
  2022-05-18  6:30   ` Kohei Tarumizu
@ 2022-05-18  7:04     ` Greg KH
  -1 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  7:04 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

On Wed, May 18, 2022 at 03:30:26PM +0900, Kohei Tarumizu wrote:
> Adds Kconfig/Makefile to build hardware prefetch control core driver,
> and also adds a MAINTAINERS entry.
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  MAINTAINERS           | 6 ++++++
>  drivers/base/Kconfig  | 9 +++++++++
>  drivers/base/Makefile | 1 +
>  3 files changed, 16 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d6d879cb0afd..f188403bc2e9 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8620,6 +8620,12 @@ F:	include/linux/hwmon*.h
>  F:	include/trace/events/hwmon*.h
>  K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
>  
> +HARDWARE PREFETCH CONTROL DRIVERS
> +M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> +S:	Maintained
> +F:	drivers/base/pfctl.c
> +F:	include/linux/pfctl.h
> +
>  HARDWARE RANDOM NUMBER GENERATOR CORE
>  M:	Matt Mackall <mpm@selenic.com>
>  M:	Herbert Xu <herbert@gondor.apana.org.au>
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 6f04b831a5c0..8f8a69e7f645 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -230,4 +230,13 @@ config GENERIC_ARCH_NUMA
>  	  Enable support for generic NUMA implementation. Currently, RISC-V
>  	  and ARM64 use it.
>  
> +config HWPF_CONTROL

Shouldn't this have "GENERIC" in the name liks the other generic
implementations in this directory?


> +	bool "Hardware Prefetch Control driver"
> +	help
> +	  This driver allows user to control CPU's Hardware Prefetch behavior.
> +	  If the machine supports this behavior, it provides a sysfs interface.

But this is not a driver, it's a core function that a driver uses.  On
its own, this code does nothing.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/8] drivers: base: Add Kconfig/Makefile to build hardware prefetch control core driver
@ 2022-05-18  7:04     ` Greg KH
  0 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  7:04 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

On Wed, May 18, 2022 at 03:30:26PM +0900, Kohei Tarumizu wrote:
> Adds Kconfig/Makefile to build hardware prefetch control core driver,
> and also adds a MAINTAINERS entry.
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  MAINTAINERS           | 6 ++++++
>  drivers/base/Kconfig  | 9 +++++++++
>  drivers/base/Makefile | 1 +
>  3 files changed, 16 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d6d879cb0afd..f188403bc2e9 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8620,6 +8620,12 @@ F:	include/linux/hwmon*.h
>  F:	include/trace/events/hwmon*.h
>  K:	(devm_)?hwmon_device_(un)?register(|_with_groups|_with_info)
>  
> +HARDWARE PREFETCH CONTROL DRIVERS
> +M:	Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> +S:	Maintained
> +F:	drivers/base/pfctl.c
> +F:	include/linux/pfctl.h
> +
>  HARDWARE RANDOM NUMBER GENERATOR CORE
>  M:	Matt Mackall <mpm@selenic.com>
>  M:	Herbert Xu <herbert@gondor.apana.org.au>
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 6f04b831a5c0..8f8a69e7f645 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -230,4 +230,13 @@ config GENERIC_ARCH_NUMA
>  	  Enable support for generic NUMA implementation. Currently, RISC-V
>  	  and ARM64 use it.
>  
> +config HWPF_CONTROL

Shouldn't this have "GENERIC" in the name liks the other generic
implementations in this directory?


> +	bool "Hardware Prefetch Control driver"
> +	help
> +	  This driver allows user to control CPU's Hardware Prefetch behavior.
> +	  If the machine supports this behavior, it provides a sysfs interface.

But this is not a driver, it's a core function that a driver uses.  On
its own, this code does nothing.

thanks,

greg k-h

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 1/8] drivers: base: Add hardware prefetch control core driver
  2022-05-18  6:30   ` Kohei Tarumizu
@ 2022-05-18  7:09     ` Greg KH
  -1 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  7:09 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

On Wed, May 18, 2022 at 03:30:25PM +0900, Kohei Tarumizu wrote:
> Adds a register/unregister function to provide sysfs interface to
> control CPU's hardware prefetch behavior. It creates the
> "prefetch_control" sysfs directory and some attributes.
> 
> Attributes are hardware dependent, so it must be implemented for each
> hardware. If CPU has a hardware prefetch behavior, call this function
> to create sysfs.
> 
> Following patches add support for A64FX and x86.
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  drivers/base/pfctl.c  | 180 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pfctl.h |  14 ++++
>  2 files changed, 194 insertions(+)
>  create mode 100644 drivers/base/pfctl.c
>  create mode 100644 include/linux/pfctl.h
> 
> diff --git a/drivers/base/pfctl.c b/drivers/base/pfctl.c
> new file mode 100644
> index 000000000000..08ee8faaf277
> --- /dev/null
> +++ b/drivers/base/pfctl.c
> @@ -0,0 +1,180 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Hardware prefetch control support via sysfs.
> + *
> + * Copyright 2022 FUJITSU LIMITED
> + *
> + * See Documentation/ABI/testing/sysfs-devices-system-cpu for more information.
> + */
> +
> +#include <linux/cacheinfo.h>
> +#include <linux/cpu.h>
> +#include <linux/device.h>
> +#include <linux/pfctl.h>
> +#include <linux/slab.h>
> +
> +#ifdef pr_fmt
> +#undef pr_fmt
> +#endif
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +const struct pfctl_group *pgroups;
> +enum cpuhp_state hp_online;
> +
> +static struct device_attribute **
> +get_pfctl_attribute(unsigned int level, enum cache_type type)
> +{
> +	int i;
> +
> +	for (i = 0; pgroups[i].attrs; i++)
> +		if ((level == pgroups[i].level) && (type == pgroups[i].type))
> +			return pgroups[i].attrs;
> +
> +	return NULL;
> +}
> +
> +static int remove_pfctl_attr(struct device *index_dev, void *data)
> +{
> +	struct cacheinfo *leaf = dev_get_drvdata(index_dev);
> +	struct device_attribute **attrs;
> +	struct device *pfctl_dev;
> +	int i;
> +
> +	attrs = get_pfctl_attribute(leaf->level, leaf->type);
> +	if (!attrs)
> +		return 0;
> +
> +	pfctl_dev = device_find_child_by_name(index_dev, "prefetch_control");
> +	if (!pfctl_dev)
> +		return 0;
> +
> +	for (i = 0; attrs[i]; i++)
> +		device_remove_file(pfctl_dev, attrs[i]);

This feels wrong, attributes should be groups and be automatically added
and removed by the driver core that way.  Not as lists of attributes
like this, as that will race and be wrong.

Use a list of attribute groups please.

> +
> +	device_unregister(pfctl_dev);
> +	put_device(pfctl_dev);
> +
> +	pfctl_dev = NULL;
> +
> +	return 0;
> +}
> +
> +static int create_pfctl_attr(struct device *index_dev, void *data)
> +{
> +	struct cacheinfo *leaf = dev_get_drvdata(index_dev);
> +	struct device_attribute **attrs;
> +	struct device *pfctl_dev;
> +	int i, err;
> +
> +	attrs = get_pfctl_attribute(leaf->level, leaf->type);
> +	if (!attrs)
> +		return 0;
> +
> +	pfctl_dev = cpu_device_create(index_dev, NULL, NULL,
> +				      "prefetch_control");
> +	if (IS_ERR(pfctl_dev))
> +		return PTR_ERR(pfctl_dev);
> +
> +	for (i = 0; attrs[i]; i++) {
> +		err = device_create_file(pfctl_dev, attrs[i]);

You just raced with userspace and lost :(

Please use attribute groups instead of manually adding files after the
device is created and userspace is notified that it was present.

That also makes your clean up logic much simpler (i.e. none as the drive
core did it for you already.)

> +		if (err) {
> +			while (--i >= 0)
> +				device_remove_file(pfctl_dev, attrs[i]);
> +
> +			device_unregister(pfctl_dev);
> +			pfctl_dev = NULL;
> +
> +			return err;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int pfctl_online(unsigned int cpu)
> +{
> +	struct device *cpu_dev = get_cpu_device(cpu);
> +	struct device *cache_dev;
> +	int ret;
> +
> +	cache_dev = device_find_child_by_name(cpu_dev, "cache");
> +	if (!cache_dev)
> +		return -ENODEV;
> +
> +	ret = device_for_each_child(cache_dev, NULL, create_pfctl_attr);
> +	if (ret < 0)
> +		device_for_each_child(cache_dev, NULL, remove_pfctl_attr);
> +
> +	put_device(cache_dev);
> +
> +	return ret;
> +}
> +
> +static int pfctl_prepare_down(unsigned int cpu)
> +{
> +	struct device *cpu_dev = get_cpu_device(cpu);
> +	struct device *cache_dev;
> +
> +	cache_dev = device_find_child_by_name(cpu_dev, "cache");
> +	if (!cache_dev)
> +		return 0;
> +
> +	device_for_each_child(cache_dev, NULL, remove_pfctl_attr);
> +
> +	put_device(cache_dev);
> +
> +	return 0;
> +}
> +
> +/**
> + * pfctl_register_attrs - register a Hardware Prefetch Control attributes
> + * @pfctl_groups: pfctl_groups contains device attribute group to control the
> + *                hardware prefetch register.
> + *
> + * Note: Call this function after the cache device is initialized because it
> + * requires access to the cache device. (e.g. Call at the late_initcall)
> + *
> + * Context: Any context.
> + * Return: 0 on success, negative error code on failure.
> + */
> +int pfctl_register_attrs(const struct pfctl_group *pfctl_groups)
> +{
> +	int ret;
> +
> +	if (pgroups)
> +		return -EEXIST;
> +
> +	pgroups = pfctl_groups;
> +
> +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "base/pfctl:online",
> +				pfctl_online, pfctl_prepare_down);
> +	if (ret < 0) {
> +		pr_err("failed to register hotplug callbacks\n");
> +		pgroups = NULL;
> +		return ret;
> +	}
> +
> +	hp_online = ret;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(pfctl_register_attrs);
> +
> +/**
> + * pfctl_unregister_attrs - unregister the Hardware Prefetch Control driver
> + * @pfctl_groups: Used to verify that this function is called by the same driver
> + *                that called pfctl_register_attrs.
> + *
> + * Context: Any context.
> + * Return: nothing.
> + */
> +void pfctl_unregister_attrs(const struct pfctl_group *pfctl_groups)
> +{
> +	if (!pgroups || (pfctl_groups != pgroups))
> +		return;
> +
> +	cpuhp_remove_state(hp_online);
> +
> +	pgroups = NULL;
> +}
> +EXPORT_SYMBOL_GPL(pfctl_unregister_attrs);
> diff --git a/include/linux/pfctl.h b/include/linux/pfctl.h
> new file mode 100644
> index 000000000000..ecdab78be09f
> --- /dev/null
> +++ b/include/linux/pfctl.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_PFCTL_H
> +#define _LINUX_PFCTL_H
> +
> +struct pfctl_group {
> +	unsigned int		level;
> +	enum cache_type		type;
> +	struct device_attribute	**attrs;

Attribute groups please.

> +};
> +
> +int pfctl_register_attrs(const struct pfctl_group *pfctl_groups);
> +void pfctl_unregister_attrs(const struct pfctl_group *pfctl_groups);

Why do you think this needs to be in the driver core?  Why isn't this
just a normal cpu driver?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 1/8] drivers: base: Add hardware prefetch control core driver
@ 2022-05-18  7:09     ` Greg KH
  0 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  7:09 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

On Wed, May 18, 2022 at 03:30:25PM +0900, Kohei Tarumizu wrote:
> Adds a register/unregister function to provide sysfs interface to
> control CPU's hardware prefetch behavior. It creates the
> "prefetch_control" sysfs directory and some attributes.
> 
> Attributes are hardware dependent, so it must be implemented for each
> hardware. If CPU has a hardware prefetch behavior, call this function
> to create sysfs.
> 
> Following patches add support for A64FX and x86.
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  drivers/base/pfctl.c  | 180 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pfctl.h |  14 ++++
>  2 files changed, 194 insertions(+)
>  create mode 100644 drivers/base/pfctl.c
>  create mode 100644 include/linux/pfctl.h
> 
> diff --git a/drivers/base/pfctl.c b/drivers/base/pfctl.c
> new file mode 100644
> index 000000000000..08ee8faaf277
> --- /dev/null
> +++ b/drivers/base/pfctl.c
> @@ -0,0 +1,180 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Hardware prefetch control support via sysfs.
> + *
> + * Copyright 2022 FUJITSU LIMITED
> + *
> + * See Documentation/ABI/testing/sysfs-devices-system-cpu for more information.
> + */
> +
> +#include <linux/cacheinfo.h>
> +#include <linux/cpu.h>
> +#include <linux/device.h>
> +#include <linux/pfctl.h>
> +#include <linux/slab.h>
> +
> +#ifdef pr_fmt
> +#undef pr_fmt
> +#endif
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +const struct pfctl_group *pgroups;
> +enum cpuhp_state hp_online;
> +
> +static struct device_attribute **
> +get_pfctl_attribute(unsigned int level, enum cache_type type)
> +{
> +	int i;
> +
> +	for (i = 0; pgroups[i].attrs; i++)
> +		if ((level == pgroups[i].level) && (type == pgroups[i].type))
> +			return pgroups[i].attrs;
> +
> +	return NULL;
> +}
> +
> +static int remove_pfctl_attr(struct device *index_dev, void *data)
> +{
> +	struct cacheinfo *leaf = dev_get_drvdata(index_dev);
> +	struct device_attribute **attrs;
> +	struct device *pfctl_dev;
> +	int i;
> +
> +	attrs = get_pfctl_attribute(leaf->level, leaf->type);
> +	if (!attrs)
> +		return 0;
> +
> +	pfctl_dev = device_find_child_by_name(index_dev, "prefetch_control");
> +	if (!pfctl_dev)
> +		return 0;
> +
> +	for (i = 0; attrs[i]; i++)
> +		device_remove_file(pfctl_dev, attrs[i]);

This feels wrong, attributes should be groups and be automatically added
and removed by the driver core that way.  Not as lists of attributes
like this, as that will race and be wrong.

Use a list of attribute groups please.

> +
> +	device_unregister(pfctl_dev);
> +	put_device(pfctl_dev);
> +
> +	pfctl_dev = NULL;
> +
> +	return 0;
> +}
> +
> +static int create_pfctl_attr(struct device *index_dev, void *data)
> +{
> +	struct cacheinfo *leaf = dev_get_drvdata(index_dev);
> +	struct device_attribute **attrs;
> +	struct device *pfctl_dev;
> +	int i, err;
> +
> +	attrs = get_pfctl_attribute(leaf->level, leaf->type);
> +	if (!attrs)
> +		return 0;
> +
> +	pfctl_dev = cpu_device_create(index_dev, NULL, NULL,
> +				      "prefetch_control");
> +	if (IS_ERR(pfctl_dev))
> +		return PTR_ERR(pfctl_dev);
> +
> +	for (i = 0; attrs[i]; i++) {
> +		err = device_create_file(pfctl_dev, attrs[i]);

You just raced with userspace and lost :(

Please use attribute groups instead of manually adding files after the
device is created and userspace is notified that it was present.

That also makes your clean up logic much simpler (i.e. none as the drive
core did it for you already.)

> +		if (err) {
> +			while (--i >= 0)
> +				device_remove_file(pfctl_dev, attrs[i]);
> +
> +			device_unregister(pfctl_dev);
> +			pfctl_dev = NULL;
> +
> +			return err;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int pfctl_online(unsigned int cpu)
> +{
> +	struct device *cpu_dev = get_cpu_device(cpu);
> +	struct device *cache_dev;
> +	int ret;
> +
> +	cache_dev = device_find_child_by_name(cpu_dev, "cache");
> +	if (!cache_dev)
> +		return -ENODEV;
> +
> +	ret = device_for_each_child(cache_dev, NULL, create_pfctl_attr);
> +	if (ret < 0)
> +		device_for_each_child(cache_dev, NULL, remove_pfctl_attr);
> +
> +	put_device(cache_dev);
> +
> +	return ret;
> +}
> +
> +static int pfctl_prepare_down(unsigned int cpu)
> +{
> +	struct device *cpu_dev = get_cpu_device(cpu);
> +	struct device *cache_dev;
> +
> +	cache_dev = device_find_child_by_name(cpu_dev, "cache");
> +	if (!cache_dev)
> +		return 0;
> +
> +	device_for_each_child(cache_dev, NULL, remove_pfctl_attr);
> +
> +	put_device(cache_dev);
> +
> +	return 0;
> +}
> +
> +/**
> + * pfctl_register_attrs - register a Hardware Prefetch Control attributes
> + * @pfctl_groups: pfctl_groups contains device attribute group to control the
> + *                hardware prefetch register.
> + *
> + * Note: Call this function after the cache device is initialized because it
> + * requires access to the cache device. (e.g. Call at the late_initcall)
> + *
> + * Context: Any context.
> + * Return: 0 on success, negative error code on failure.
> + */
> +int pfctl_register_attrs(const struct pfctl_group *pfctl_groups)
> +{
> +	int ret;
> +
> +	if (pgroups)
> +		return -EEXIST;
> +
> +	pgroups = pfctl_groups;
> +
> +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "base/pfctl:online",
> +				pfctl_online, pfctl_prepare_down);
> +	if (ret < 0) {
> +		pr_err("failed to register hotplug callbacks\n");
> +		pgroups = NULL;
> +		return ret;
> +	}
> +
> +	hp_online = ret;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(pfctl_register_attrs);
> +
> +/**
> + * pfctl_unregister_attrs - unregister the Hardware Prefetch Control driver
> + * @pfctl_groups: Used to verify that this function is called by the same driver
> + *                that called pfctl_register_attrs.
> + *
> + * Context: Any context.
> + * Return: nothing.
> + */
> +void pfctl_unregister_attrs(const struct pfctl_group *pfctl_groups)
> +{
> +	if (!pgroups || (pfctl_groups != pgroups))
> +		return;
> +
> +	cpuhp_remove_state(hp_online);
> +
> +	pgroups = NULL;
> +}
> +EXPORT_SYMBOL_GPL(pfctl_unregister_attrs);
> diff --git a/include/linux/pfctl.h b/include/linux/pfctl.h
> new file mode 100644
> index 000000000000..ecdab78be09f
> --- /dev/null
> +++ b/include/linux/pfctl.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_PFCTL_H
> +#define _LINUX_PFCTL_H
> +
> +struct pfctl_group {
> +	unsigned int		level;
> +	enum cache_type		type;
> +	struct device_attribute	**attrs;

Attribute groups please.

> +};
> +
> +int pfctl_register_attrs(const struct pfctl_group *pfctl_groups);
> +void pfctl_unregister_attrs(const struct pfctl_group *pfctl_groups);

Why do you think this needs to be in the driver core?  Why isn't this
just a normal cpu driver?

thanks,

greg k-h

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control
  2022-05-18  6:30   ` Kohei Tarumizu
@ 2022-05-18  7:09     ` Greg KH
  -1 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  7:09 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

On Wed, May 18, 2022 at 03:30:29PM +0900, Kohei Tarumizu wrote:
> Create a cache sysfs directory without ACPI PPTT if the
> CONFIG_HWPF_CONTROL is true.
> 
> Hardware prefetch control driver need cache sysfs directory and cache
> level/type information. In ARM processor, these information can be
> obtained from the register even without PPTT.
> 
> This patch set the cpu_map_populated to true if the machine doesn't
> have PPTT. It use only the level/type information obtained from
> CLIDR_EL1, and don't use CCSIDR information.
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  arch/arm64/kernel/cacheinfo.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
> 
> diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
> index 587543c6c51c..039ec32d0b3d 100644
> --- a/arch/arm64/kernel/cacheinfo.c
> +++ b/arch/arm64/kernel/cacheinfo.c
> @@ -43,6 +43,21 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
>  	this_leaf->type = type;
>  }
>  
> +#if defined(CONFIG_HWPF_CONTROL)

Please do not put #if in .c files.

> +static bool acpi_has_pptt(void)
> +{
> +	struct acpi_table_header *table;
> +	acpi_status status;
> +
> +	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
> +	if (ACPI_FAILURE(status))
> +		return false;
> +
> +	acpi_put_table(table);
> +	return true;
> +}
> +#endif
> +
>  int init_cache_level(unsigned int cpu)
>  {
>  	unsigned int ctype, level, leaves, fw_level;
> @@ -95,5 +110,19 @@ int populate_cache_leaves(unsigned int cpu)
>  			ci_leaf_init(this_leaf++, type, level);
>  		}
>  	}
> +
> +#if defined(CONFIG_HWPF_CONTROL)
> +	/*
> +	 * Hardware prefetch functions need cache sysfs directory and cache
> +	 * level/type information. In ARM processor, these information can be
> +	 * obtained from registers even without PPTT. Therefore, we set the
> +	 * cpu_map_populated to true to create cache sysfs directory, if the
> +	 * machine doesn't have PPTT.
> +	 **/
> +	if (!acpi_disabled)
> +		if (!acpi_has_pptt())
> +			this_cpu_ci->cpu_map_populated = true;
> +#endif

Same here, no #if is needed if you do it properly in your .h file.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control
@ 2022-05-18  7:09     ` Greg KH
  0 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  7:09 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

On Wed, May 18, 2022 at 03:30:29PM +0900, Kohei Tarumizu wrote:
> Create a cache sysfs directory without ACPI PPTT if the
> CONFIG_HWPF_CONTROL is true.
> 
> Hardware prefetch control driver need cache sysfs directory and cache
> level/type information. In ARM processor, these information can be
> obtained from the register even without PPTT.
> 
> This patch set the cpu_map_populated to true if the machine doesn't
> have PPTT. It use only the level/type information obtained from
> CLIDR_EL1, and don't use CCSIDR information.
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  arch/arm64/kernel/cacheinfo.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
> 
> diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
> index 587543c6c51c..039ec32d0b3d 100644
> --- a/arch/arm64/kernel/cacheinfo.c
> +++ b/arch/arm64/kernel/cacheinfo.c
> @@ -43,6 +43,21 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
>  	this_leaf->type = type;
>  }
>  
> +#if defined(CONFIG_HWPF_CONTROL)

Please do not put #if in .c files.

> +static bool acpi_has_pptt(void)
> +{
> +	struct acpi_table_header *table;
> +	acpi_status status;
> +
> +	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
> +	if (ACPI_FAILURE(status))
> +		return false;
> +
> +	acpi_put_table(table);
> +	return true;
> +}
> +#endif
> +
>  int init_cache_level(unsigned int cpu)
>  {
>  	unsigned int ctype, level, leaves, fw_level;
> @@ -95,5 +110,19 @@ int populate_cache_leaves(unsigned int cpu)
>  			ci_leaf_init(this_leaf++, type, level);
>  		}
>  	}
> +
> +#if defined(CONFIG_HWPF_CONTROL)
> +	/*
> +	 * Hardware prefetch functions need cache sysfs directory and cache
> +	 * level/type information. In ARM processor, these information can be
> +	 * obtained from registers even without PPTT. Therefore, we set the
> +	 * cpu_map_populated to true to create cache sysfs directory, if the
> +	 * machine doesn't have PPTT.
> +	 **/
> +	if (!acpi_disabled)
> +		if (!acpi_has_pptt())
> +			this_cpu_ci->cpu_map_populated = true;
> +#endif

Same here, no #if is needed if you do it properly in your .h file.

thanks,

greg k-h

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 3/8] soc: fujitsu: Add hardware prefetch control support for A64FX
  2022-05-18  6:30   ` Kohei Tarumizu
@ 2022-05-18  7:10     ` Greg KH
  -1 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  7:10 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

On Wed, May 18, 2022 at 03:30:27PM +0900, Kohei Tarumizu wrote:
> Adds module init/exit code to create sysfs attributes for A64FX with
> "stream_detect_prefetcher_enable", "stream_detect_prefetcher_strong"
> and "stream_detect_prefetcher_dist".
> 
> This driver works only if part number is FUJITSU_CPU_PART_A64FX. The
> details of the registers to be read and written in this patch are
> described below.
> 
> "https://github.com/fujitsu/A64FX/tree/master/doc/"
>     A64FX_Specification_HPC_Extension_v1_EN.pdf
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  drivers/soc/fujitsu/a64fx-pfctl.c | 373 ++++++++++++++++++++++++++++++
>  1 file changed, 373 insertions(+)
>  create mode 100644 drivers/soc/fujitsu/a64fx-pfctl.c
> 
> diff --git a/drivers/soc/fujitsu/a64fx-pfctl.c b/drivers/soc/fujitsu/a64fx-pfctl.c
> new file mode 100644
> index 000000000000..f765f53987d4
> --- /dev/null
> +++ b/drivers/soc/fujitsu/a64fx-pfctl.c
> @@ -0,0 +1,373 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright 2022 FUJITSU LIMITED
> + *
> + * A64FX Hardware Prefetch Control support
> + */
> +
> +#include <asm/cputype.h>
> +#include <linux/cacheinfo.h>
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/pfctl.h>
> +#include <linux/smp.h>
> +#include <linux/sysfs.h>
> +
> +/*
> + * Constants for these add the "A64FX_SDPF" prefix to the name described in
> + * section "1.3.4.2. IMP_PF_STREAM_DETECT_CTRL_EL0" of "A64FX specification".
> + * (https://github.com/fujitsu/A64FX/tree/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf")
> + * See this document for register specification details.
> + */
> +#define A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0	sys_reg(3, 3, 11, 4, 0)
> +#define A64FX_SDPF_V					BIT_ULL(63)
> +#define A64FX_SDPF_L1PF_DIS				BIT_ULL(59)
> +#define A64FX_SDPF_L2PF_DIS				BIT_ULL(58)
> +#define A64FX_SDPF_L1W					BIT_ULL(55)
> +#define A64FX_SDPF_L2W					BIT_ULL(54)
> +#define A64FX_SDPF_L1_DIST				GENMASK_ULL(27, 24)
> +#define A64FX_SDPF_L2_DIST				GENMASK_ULL(19, 16)
> +
> +#define PFCTL_MIN_L1_DIST				256
> +#define PFCTL_MIN_L2_DIST				1024
> +#define PFCTL_DIST_AUTO_VAL				0
> +#define PFCTL_STRONG_VAL				0
> +#define PFCTL_WEAK_VAL					1
> +
> +/*
> + * Define bitfield access macros for non-constant mask, because macros such as
> + * FIELD_FIT defined in include/linux/bitfield.h require constant mask.
> + */
> +#define NC_FIELD_FIT(_mask, _val)					\
> +	!((((typeof(_mask))_val) << __bf_shf(_mask)) & ~(_mask))
> +
> +#define NC_FIELD_PREP(_mask, _val)					\
> +	(((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask))
> +
> +#define NC_FIELD_GET(_mask, _reg)					\
> +	((typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask)))
> +
> +struct a64fx_pfctl_attr {
> +	struct device_attribute	attr;
> +	u64			mask;
> +	void			*data;
> +};
> +
> +static const char strength_strong_string[] = "strong";
> +static const char strength_weak_string[] = "weak";
> +static const char dist_auto_string[] = "auto";
> +
> +/*
> + * Returns the cpu number of the cpu_device(/sys/devices/system/cpu/cpuX)
> + * in the ancestor directory of prefetch_control.
> + *
> + * When initializing this driver, it is verified that the cache directory exists
> + * under cpuX device. Therefore, the third level up from prefetch_control is
> + * cpuX device as shown below.
> + *
> + * /sys/devices/system/cpu/cpuX/cache/indexX/prefetch_control
> + */
> +static inline unsigned int pfctl_dev_get_cpu(struct device *pfctl_dev)
> +{
> +	return pfctl_dev->parent->parent->parent->id;
> +}
> +
> +/* Enable the value written to the A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0 */
> +static inline void enable_verify(u64 *reg)
> +{
> +	*reg |= NC_FIELD_PREP(A64FX_SDPF_V, 1);
> +}
> +
> +static void _pfctl_read_mask(void *_reg)
> +{
> +	u64 *reg = (u64 *)_reg;
> +
> +	*reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
> +}
> +
> +static u64 pfctl_read_mask(unsigned int cpu, u64 mask)
> +{
> +	u64 reg;
> +
> +	smp_call_function_single(cpu, _pfctl_read_mask, &reg, true);
> +
> +	return NC_FIELD_GET(mask, reg);
> +}
> +
> +struct write_info {
> +	u64 mask;
> +	u64 val;
> +};
> +
> +static void _pfctl_write_mask(void *info)
> +{
> +	struct write_info *winfo = info;
> +	u64 reg;
> +
> +	reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
> +
> +	reg &= ~winfo->mask;
> +	reg |= NC_FIELD_PREP(winfo->mask, winfo->val);
> +
> +	enable_verify(&reg);
> +
> +	write_sysreg_s(reg, A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
> +}
> +
> +static int pfctl_write_mask(unsigned int cpu, u64 mask, u64 val)
> +{
> +	struct write_info info = {
> +		.mask	= mask,
> +		.val	= val,
> +	};
> +
> +	if (!NC_FIELD_FIT(mask, val))
> +		return -EINVAL;
> +
> +	smp_call_function_single(cpu, _pfctl_write_mask, &info, true);
> +
> +	return 0;
> +}
> +
> +static ssize_t
> +pfctl_enable_show(struct device *pfctl_dev, struct device_attribute *attr,
> +		char *buf)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	bool val;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +
> +	val = (bool)pfctl_read_mask(cpu, aa->mask);
> +
> +	return sysfs_emit(buf, "%d\n", val ? 0 : 1);
> +}
> +
> +static ssize_t
> +pfctl_enable_store(struct device *pfctl_dev, struct device_attribute *attr,
> +		 const char *buf, size_t size)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	bool val;
> +	int ret;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +
> +	if (strtobool(buf, &val) < 0)
> +		return -EINVAL;
> +
> +	ret = pfctl_write_mask(cpu, aa->mask, val ? 0 : 1);
> +	if (ret < 0)
> +		return ret;
> +
> +	return size;
> +}
> +
> +#define A64FX_PFCTL_ENABLE_ATTR(_level, _mask)				\
> +	struct a64fx_pfctl_attr attr_l##_level##_enable = {		\
> +		.attr = __ATTR(stream_detect_prefetcher_enable, 0600,	\
> +			       pfctl_enable_show, pfctl_enable_store),	\
> +		.mask = _mask, }
> +
> +
> +static A64FX_PFCTL_ENABLE_ATTR(1, A64FX_SDPF_L1PF_DIS);
> +static A64FX_PFCTL_ENABLE_ATTR(2, A64FX_SDPF_L2PF_DIS);
> +
> +static ssize_t
> +pfctl_dist_show(struct device *pfctl_dev, struct device_attribute *attr,
> +		char *buf)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	u64 val, min_dist;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +	min_dist = (u64)aa->data;
> +
> +	val = pfctl_read_mask(cpu, aa->mask);
> +
> +	if (val == PFCTL_DIST_AUTO_VAL)
> +		return sysfs_emit(buf, "%s\n", dist_auto_string);
> +	else
> +		return sysfs_emit(buf, "%llu\n", val * min_dist);
> +}
> +
> +static ssize_t
> +pfctl_dist_store(struct device *pfctl_dev, struct device_attribute *attr,
> +		 const char *buf, size_t size)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	u64 min_dist, val;
> +	int ret;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +	min_dist = (u64)aa->data;
> +
> +	if (sysfs_streq(buf, dist_auto_string)) {
> +		val = PFCTL_DIST_AUTO_VAL;
> +	} else {
> +		ret = kstrtoull(buf, 10, &val);
> +		if (ret < 0 || val < 1)
> +			return -EINVAL;
> +	}
> +
> +	val = roundup(val, min_dist) / min_dist;
> +
> +	if (!NC_FIELD_FIT(aa->mask, val))
> +		return -EINVAL;
> +
> +	ret = pfctl_write_mask(cpu, aa->mask, val);
> +	if (ret < 0)
> +		return ret;
> +
> +	return size;
> +}
> +
> +#define PFCTL_DIST_ATTR(_level, _mask, _min_dist)			\
> +	struct a64fx_pfctl_attr attr_l##_level##_dist = {		\
> +		.attr = __ATTR(stream_detect_prefetcher_dist, 0600,	\
> +			       pfctl_dist_show, pfctl_dist_store),	\
> +		.mask = _mask,						\
> +		.data = (void *)(u64)_min_dist, }
> +
> +static PFCTL_DIST_ATTR(1, A64FX_SDPF_L1_DIST, PFCTL_MIN_L1_DIST);
> +static PFCTL_DIST_ATTR(2, A64FX_SDPF_L2_DIST, PFCTL_MIN_L2_DIST);
> +
> +static ssize_t
> +pfctl_strength_show(struct device *pfctl_dev, struct device_attribute *attr,
> +		    char *buf)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	u64 val;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +
> +	val = pfctl_read_mask(cpu, aa->mask);
> +
> +	switch (val) {
> +	case PFCTL_STRONG_VAL:
> +		return sysfs_emit(buf, "%s\n", strength_strong_string);
> +	case PFCTL_WEAK_VAL:
> +		return sysfs_emit(buf, "%s\n", strength_weak_string);
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +static ssize_t
> +pfctl_strength_store(struct device *pfctl_dev, struct device_attribute *attr,
> +		     const char *buf, size_t size)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	u64 val;
> +	int ret;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +
> +	if (sysfs_streq(buf, strength_strong_string))
> +		val = PFCTL_STRONG_VAL;
> +	else if (sysfs_streq(buf, strength_weak_string))
> +		val = PFCTL_WEAK_VAL;
> +	else
> +		return -EINVAL;
> +
> +	ret = pfctl_write_mask(cpu, aa->mask, val);
> +	if (ret < 0)
> +		return ret;
> +
> +	return size;
> +}
> +
> +#define PFCTL_STRENGTH_ATTR(_level, _mask)				\
> +	struct a64fx_pfctl_attr attr_l##_level##_strength = {		\
> +		.attr = __ATTR(stream_detect_prefetcher_strength, 0600, \
> +			       pfctl_strength_show,			\
> +			       pfctl_strength_store),			\
> +		.mask = _mask, }
> +
> +static PFCTL_STRENGTH_ATTR(1, A64FX_SDPF_L1W);
> +static PFCTL_STRENGTH_ATTR(2, A64FX_SDPF_L2W);
> +
> +static ssize_t
> +pfctl_strength_available_show(struct device *dev, struct device_attribute *attr,
> +			      char *buf)
> +{
> +	return sysfs_emit(buf, "%s %s\n", strength_strong_string,
> +			  strength_weak_string);
> +}
> +
> +/*
> + * A64FX has same kind of available strength for any caches, so define only one
> + * attribute.
> + */
> +struct a64fx_pfctl_attr attr_strength_available = {
> +	.attr = __ATTR(stream_detect_prefetcher_strength_available, 0400,
> +		       pfctl_strength_available_show, NULL), };
> +
> +static struct device_attribute *l1_attrs[] __ro_after_init = {
> +	&attr_l1_enable.attr,
> +	&attr_l1_dist.attr,
> +	&attr_l1_strength.attr,
> +	&attr_strength_available.attr,
> +	NULL,
> +};
> +
> +static struct device_attribute *l2_attrs[] __ro_after_init = {
> +	&attr_l2_enable.attr,
> +	&attr_l2_dist.attr,
> +	&attr_l2_strength.attr,
> +	&attr_strength_available.attr,
> +	NULL,
> +};
> +
> +static const struct pfctl_group a64fx_pfctl_groups[] = {
> +	{
> +		.level = 1,
> +		.type = CACHE_TYPE_DATA,
> +		.attrs = l1_attrs,
> +	},
> +	{
> +		.level = 2,
> +		.type = CACHE_TYPE_UNIFIED,
> +		.attrs = l2_attrs,
> +	},
> +	{
> +		.attrs = NULL,
> +	},
> +};
> +
> +/*
> + * This driver returns a negative value if it does not support the Hardware
> + * Prefetch Control or if it is running on a VM guest.
> + */
> +static int __init a64fx_pfctl_init(void)
> +{
> +	unsigned long implementor = read_cpuid_implementor();
> +	unsigned long part_number = read_cpuid_part_number();
> +
> +	if (!is_kernel_in_hyp_mode())
> +		return -EINVAL;
> +
> +	if ((implementor != ARM_CPU_IMP_FUJITSU) ||
> +	    (part_number != FUJITSU_CPU_PART_A64FX))
> +		return -ENODEV;
> +
> +	return pfctl_register_attrs(a64fx_pfctl_groups);
> +}
> +
> +static void __exit a64fx_pfctl_exit(void)
> +{
> +	return pfctl_unregister_attrs(a64fx_pfctl_groups);
> +}
> +
> +late_initcall(a64fx_pfctl_init);
> +module_exit(a64fx_pfctl_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("FUJITSU LIMITED");
> +MODULE_DESCRIPTION("A64FX Hardware Prefetch Control Driver");
> -- 
> 2.27.0
> 

How will this driver be properly loaded by the system if the hardware is
present?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 3/8] soc: fujitsu: Add hardware prefetch control support for A64FX
@ 2022-05-18  7:10     ` Greg KH
  0 siblings, 0 replies; 50+ messages in thread
From: Greg KH @ 2022-05-18  7:10 UTC (permalink / raw)
  To: Kohei Tarumizu
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

On Wed, May 18, 2022 at 03:30:27PM +0900, Kohei Tarumizu wrote:
> Adds module init/exit code to create sysfs attributes for A64FX with
> "stream_detect_prefetcher_enable", "stream_detect_prefetcher_strong"
> and "stream_detect_prefetcher_dist".
> 
> This driver works only if part number is FUJITSU_CPU_PART_A64FX. The
> details of the registers to be read and written in this patch are
> described below.
> 
> "https://github.com/fujitsu/A64FX/tree/master/doc/"
>     A64FX_Specification_HPC_Extension_v1_EN.pdf
> 
> Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
> ---
>  drivers/soc/fujitsu/a64fx-pfctl.c | 373 ++++++++++++++++++++++++++++++
>  1 file changed, 373 insertions(+)
>  create mode 100644 drivers/soc/fujitsu/a64fx-pfctl.c
> 
> diff --git a/drivers/soc/fujitsu/a64fx-pfctl.c b/drivers/soc/fujitsu/a64fx-pfctl.c
> new file mode 100644
> index 000000000000..f765f53987d4
> --- /dev/null
> +++ b/drivers/soc/fujitsu/a64fx-pfctl.c
> @@ -0,0 +1,373 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright 2022 FUJITSU LIMITED
> + *
> + * A64FX Hardware Prefetch Control support
> + */
> +
> +#include <asm/cputype.h>
> +#include <linux/cacheinfo.h>
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/pfctl.h>
> +#include <linux/smp.h>
> +#include <linux/sysfs.h>
> +
> +/*
> + * Constants for these add the "A64FX_SDPF" prefix to the name described in
> + * section "1.3.4.2. IMP_PF_STREAM_DETECT_CTRL_EL0" of "A64FX specification".
> + * (https://github.com/fujitsu/A64FX/tree/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf")
> + * See this document for register specification details.
> + */
> +#define A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0	sys_reg(3, 3, 11, 4, 0)
> +#define A64FX_SDPF_V					BIT_ULL(63)
> +#define A64FX_SDPF_L1PF_DIS				BIT_ULL(59)
> +#define A64FX_SDPF_L2PF_DIS				BIT_ULL(58)
> +#define A64FX_SDPF_L1W					BIT_ULL(55)
> +#define A64FX_SDPF_L2W					BIT_ULL(54)
> +#define A64FX_SDPF_L1_DIST				GENMASK_ULL(27, 24)
> +#define A64FX_SDPF_L2_DIST				GENMASK_ULL(19, 16)
> +
> +#define PFCTL_MIN_L1_DIST				256
> +#define PFCTL_MIN_L2_DIST				1024
> +#define PFCTL_DIST_AUTO_VAL				0
> +#define PFCTL_STRONG_VAL				0
> +#define PFCTL_WEAK_VAL					1
> +
> +/*
> + * Define bitfield access macros for non-constant mask, because macros such as
> + * FIELD_FIT defined in include/linux/bitfield.h require constant mask.
> + */
> +#define NC_FIELD_FIT(_mask, _val)					\
> +	!((((typeof(_mask))_val) << __bf_shf(_mask)) & ~(_mask))
> +
> +#define NC_FIELD_PREP(_mask, _val)					\
> +	(((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask))
> +
> +#define NC_FIELD_GET(_mask, _reg)					\
> +	((typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask)))
> +
> +struct a64fx_pfctl_attr {
> +	struct device_attribute	attr;
> +	u64			mask;
> +	void			*data;
> +};
> +
> +static const char strength_strong_string[] = "strong";
> +static const char strength_weak_string[] = "weak";
> +static const char dist_auto_string[] = "auto";
> +
> +/*
> + * Returns the cpu number of the cpu_device(/sys/devices/system/cpu/cpuX)
> + * in the ancestor directory of prefetch_control.
> + *
> + * When initializing this driver, it is verified that the cache directory exists
> + * under cpuX device. Therefore, the third level up from prefetch_control is
> + * cpuX device as shown below.
> + *
> + * /sys/devices/system/cpu/cpuX/cache/indexX/prefetch_control
> + */
> +static inline unsigned int pfctl_dev_get_cpu(struct device *pfctl_dev)
> +{
> +	return pfctl_dev->parent->parent->parent->id;
> +}
> +
> +/* Enable the value written to the A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0 */
> +static inline void enable_verify(u64 *reg)
> +{
> +	*reg |= NC_FIELD_PREP(A64FX_SDPF_V, 1);
> +}
> +
> +static void _pfctl_read_mask(void *_reg)
> +{
> +	u64 *reg = (u64 *)_reg;
> +
> +	*reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
> +}
> +
> +static u64 pfctl_read_mask(unsigned int cpu, u64 mask)
> +{
> +	u64 reg;
> +
> +	smp_call_function_single(cpu, _pfctl_read_mask, &reg, true);
> +
> +	return NC_FIELD_GET(mask, reg);
> +}
> +
> +struct write_info {
> +	u64 mask;
> +	u64 val;
> +};
> +
> +static void _pfctl_write_mask(void *info)
> +{
> +	struct write_info *winfo = info;
> +	u64 reg;
> +
> +	reg = read_sysreg_s(A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
> +
> +	reg &= ~winfo->mask;
> +	reg |= NC_FIELD_PREP(winfo->mask, winfo->val);
> +
> +	enable_verify(&reg);
> +
> +	write_sysreg_s(reg, A64FX_SDPF_IMP_PF_STREAM_DETECT_CTRL_EL0);
> +}
> +
> +static int pfctl_write_mask(unsigned int cpu, u64 mask, u64 val)
> +{
> +	struct write_info info = {
> +		.mask	= mask,
> +		.val	= val,
> +	};
> +
> +	if (!NC_FIELD_FIT(mask, val))
> +		return -EINVAL;
> +
> +	smp_call_function_single(cpu, _pfctl_write_mask, &info, true);
> +
> +	return 0;
> +}
> +
> +static ssize_t
> +pfctl_enable_show(struct device *pfctl_dev, struct device_attribute *attr,
> +		char *buf)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	bool val;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +
> +	val = (bool)pfctl_read_mask(cpu, aa->mask);
> +
> +	return sysfs_emit(buf, "%d\n", val ? 0 : 1);
> +}
> +
> +static ssize_t
> +pfctl_enable_store(struct device *pfctl_dev, struct device_attribute *attr,
> +		 const char *buf, size_t size)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	bool val;
> +	int ret;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +
> +	if (strtobool(buf, &val) < 0)
> +		return -EINVAL;
> +
> +	ret = pfctl_write_mask(cpu, aa->mask, val ? 0 : 1);
> +	if (ret < 0)
> +		return ret;
> +
> +	return size;
> +}
> +
> +#define A64FX_PFCTL_ENABLE_ATTR(_level, _mask)				\
> +	struct a64fx_pfctl_attr attr_l##_level##_enable = {		\
> +		.attr = __ATTR(stream_detect_prefetcher_enable, 0600,	\
> +			       pfctl_enable_show, pfctl_enable_store),	\
> +		.mask = _mask, }
> +
> +
> +static A64FX_PFCTL_ENABLE_ATTR(1, A64FX_SDPF_L1PF_DIS);
> +static A64FX_PFCTL_ENABLE_ATTR(2, A64FX_SDPF_L2PF_DIS);
> +
> +static ssize_t
> +pfctl_dist_show(struct device *pfctl_dev, struct device_attribute *attr,
> +		char *buf)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	u64 val, min_dist;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +	min_dist = (u64)aa->data;
> +
> +	val = pfctl_read_mask(cpu, aa->mask);
> +
> +	if (val == PFCTL_DIST_AUTO_VAL)
> +		return sysfs_emit(buf, "%s\n", dist_auto_string);
> +	else
> +		return sysfs_emit(buf, "%llu\n", val * min_dist);
> +}
> +
> +static ssize_t
> +pfctl_dist_store(struct device *pfctl_dev, struct device_attribute *attr,
> +		 const char *buf, size_t size)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	u64 min_dist, val;
> +	int ret;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +	min_dist = (u64)aa->data;
> +
> +	if (sysfs_streq(buf, dist_auto_string)) {
> +		val = PFCTL_DIST_AUTO_VAL;
> +	} else {
> +		ret = kstrtoull(buf, 10, &val);
> +		if (ret < 0 || val < 1)
> +			return -EINVAL;
> +	}
> +
> +	val = roundup(val, min_dist) / min_dist;
> +
> +	if (!NC_FIELD_FIT(aa->mask, val))
> +		return -EINVAL;
> +
> +	ret = pfctl_write_mask(cpu, aa->mask, val);
> +	if (ret < 0)
> +		return ret;
> +
> +	return size;
> +}
> +
> +#define PFCTL_DIST_ATTR(_level, _mask, _min_dist)			\
> +	struct a64fx_pfctl_attr attr_l##_level##_dist = {		\
> +		.attr = __ATTR(stream_detect_prefetcher_dist, 0600,	\
> +			       pfctl_dist_show, pfctl_dist_store),	\
> +		.mask = _mask,						\
> +		.data = (void *)(u64)_min_dist, }
> +
> +static PFCTL_DIST_ATTR(1, A64FX_SDPF_L1_DIST, PFCTL_MIN_L1_DIST);
> +static PFCTL_DIST_ATTR(2, A64FX_SDPF_L2_DIST, PFCTL_MIN_L2_DIST);
> +
> +static ssize_t
> +pfctl_strength_show(struct device *pfctl_dev, struct device_attribute *attr,
> +		    char *buf)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	u64 val;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +
> +	val = pfctl_read_mask(cpu, aa->mask);
> +
> +	switch (val) {
> +	case PFCTL_STRONG_VAL:
> +		return sysfs_emit(buf, "%s\n", strength_strong_string);
> +	case PFCTL_WEAK_VAL:
> +		return sysfs_emit(buf, "%s\n", strength_weak_string);
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +static ssize_t
> +pfctl_strength_store(struct device *pfctl_dev, struct device_attribute *attr,
> +		     const char *buf, size_t size)
> +{
> +	unsigned int cpu = pfctl_dev_get_cpu(pfctl_dev);
> +	struct a64fx_pfctl_attr *aa;
> +	u64 val;
> +	int ret;
> +
> +	aa = container_of(attr, struct a64fx_pfctl_attr, attr);
> +
> +	if (sysfs_streq(buf, strength_strong_string))
> +		val = PFCTL_STRONG_VAL;
> +	else if (sysfs_streq(buf, strength_weak_string))
> +		val = PFCTL_WEAK_VAL;
> +	else
> +		return -EINVAL;
> +
> +	ret = pfctl_write_mask(cpu, aa->mask, val);
> +	if (ret < 0)
> +		return ret;
> +
> +	return size;
> +}
> +
> +#define PFCTL_STRENGTH_ATTR(_level, _mask)				\
> +	struct a64fx_pfctl_attr attr_l##_level##_strength = {		\
> +		.attr = __ATTR(stream_detect_prefetcher_strength, 0600, \
> +			       pfctl_strength_show,			\
> +			       pfctl_strength_store),			\
> +		.mask = _mask, }
> +
> +static PFCTL_STRENGTH_ATTR(1, A64FX_SDPF_L1W);
> +static PFCTL_STRENGTH_ATTR(2, A64FX_SDPF_L2W);
> +
> +static ssize_t
> +pfctl_strength_available_show(struct device *dev, struct device_attribute *attr,
> +			      char *buf)
> +{
> +	return sysfs_emit(buf, "%s %s\n", strength_strong_string,
> +			  strength_weak_string);
> +}
> +
> +/*
> + * A64FX has same kind of available strength for any caches, so define only one
> + * attribute.
> + */
> +struct a64fx_pfctl_attr attr_strength_available = {
> +	.attr = __ATTR(stream_detect_prefetcher_strength_available, 0400,
> +		       pfctl_strength_available_show, NULL), };
> +
> +static struct device_attribute *l1_attrs[] __ro_after_init = {
> +	&attr_l1_enable.attr,
> +	&attr_l1_dist.attr,
> +	&attr_l1_strength.attr,
> +	&attr_strength_available.attr,
> +	NULL,
> +};
> +
> +static struct device_attribute *l2_attrs[] __ro_after_init = {
> +	&attr_l2_enable.attr,
> +	&attr_l2_dist.attr,
> +	&attr_l2_strength.attr,
> +	&attr_strength_available.attr,
> +	NULL,
> +};
> +
> +static const struct pfctl_group a64fx_pfctl_groups[] = {
> +	{
> +		.level = 1,
> +		.type = CACHE_TYPE_DATA,
> +		.attrs = l1_attrs,
> +	},
> +	{
> +		.level = 2,
> +		.type = CACHE_TYPE_UNIFIED,
> +		.attrs = l2_attrs,
> +	},
> +	{
> +		.attrs = NULL,
> +	},
> +};
> +
> +/*
> + * This driver returns a negative value if it does not support the Hardware
> + * Prefetch Control or if it is running on a VM guest.
> + */
> +static int __init a64fx_pfctl_init(void)
> +{
> +	unsigned long implementor = read_cpuid_implementor();
> +	unsigned long part_number = read_cpuid_part_number();
> +
> +	if (!is_kernel_in_hyp_mode())
> +		return -EINVAL;
> +
> +	if ((implementor != ARM_CPU_IMP_FUJITSU) ||
> +	    (part_number != FUJITSU_CPU_PART_A64FX))
> +		return -ENODEV;
> +
> +	return pfctl_register_attrs(a64fx_pfctl_groups);
> +}
> +
> +static void __exit a64fx_pfctl_exit(void)
> +{
> +	return pfctl_unregister_attrs(a64fx_pfctl_groups);
> +}
> +
> +late_initcall(a64fx_pfctl_init);
> +module_exit(a64fx_pfctl_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("FUJITSU LIMITED");
> +MODULE_DESCRIPTION("A64FX Hardware Prefetch Control Driver");
> -- 
> 2.27.0
> 

How will this driver be properly loaded by the system if the hardware is
present?

thanks,

greg k-h

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 1/8] drivers: base: Add hardware prefetch control core driver
  2022-05-18  7:09     ` Greg KH
@ 2022-05-18 12:38       ` tarumizu.kohei
  -1 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-18 12:38 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

Thanks for the comment.

> This feels wrong, attributes should be groups and be automatically added and
> removed by the driver core that way.  Not as lists of attributes like this, as that
> will race and be wrong.
> 
> Use a list of attribute groups please.

> You just raced with userspace and lost :(
> 
> Please use attribute groups instead of manually adding files after the device is
> created and userspace is notified that it was present.
> 
> That also makes your clean up logic much simpler (i.e. none as the drive core did it
> for you already.)

> Attribute groups please.

I modify to use attribute groups in the next version.

> Why do you think this needs to be in the driver core?  Why isn't this just a normal
> cpu driver?
> 
> thanks,

Does this mean that creating only dedicated driver for a CPU (e.g.
arch/x86/kernel/cpu/x86-pfctl.c) without creating core driver?
If so, there is no problem. I remove the core driver and create only
the dedicated drivers for each CPU in the next version.

As a reference, I previously wanted to define it as a common sysfs I/F
for A64FX and x86, so I create a core driver to provide it. However,
as I proceed with the consideration, it became clear that these were
independent I/F specifications. Therefore, the core driver is not
currently needed.



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 1/8] drivers: base: Add hardware prefetch control core driver
@ 2022-05-18 12:38       ` tarumizu.kohei
  0 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-18 12:38 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

Thanks for the comment.

> This feels wrong, attributes should be groups and be automatically added and
> removed by the driver core that way.  Not as lists of attributes like this, as that
> will race and be wrong.
> 
> Use a list of attribute groups please.

> You just raced with userspace and lost :(
> 
> Please use attribute groups instead of manually adding files after the device is
> created and userspace is notified that it was present.
> 
> That also makes your clean up logic much simpler (i.e. none as the drive core did it
> for you already.)

> Attribute groups please.

I modify to use attribute groups in the next version.

> Why do you think this needs to be in the driver core?  Why isn't this just a normal
> cpu driver?
> 
> thanks,

Does this mean that creating only dedicated driver for a CPU (e.g.
arch/x86/kernel/cpu/x86-pfctl.c) without creating core driver?
If so, there is no problem. I remove the core driver and create only
the dedicated drivers for each CPU in the next version.

As a reference, I previously wanted to define it as a common sysfs I/F
for A64FX and x86, so I create a core driver to provide it. However,
as I proceed with the consideration, it became clear that these were
independent I/F specifications. Therefore, the core driver is not
currently needed.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 0/8] Add hardware prefetch control driver for A64FX and x86
  2022-05-18  6:30 ` Kohei Tarumizu
@ 2022-05-19  8:29   ` Hector Martin
  -1 siblings, 0 replies; 50+ messages in thread
From: Hector Martin @ 2022-05-19  8:29 UTC (permalink / raw)
  To: Kohei Tarumizu, catalin.marinas, will, tglx, mingo, bp,
	dave.hansen, x86, hpa, gregkh, rafael, mchehab+huawei, eugenis,
	tony.luck, pcc, peterz, marcos, conor.dooley, nicolas.ferre,
	linus.walleij, arnd, linux-kernel, linux-arm-kernel

On 18/05/2022 15.30, Kohei Tarumizu wrote:
> This patch series add sysfs interface to control CPU's hardware
> prefetch behavior for performance tuning from userspace for the
> processor A64FX and x86 (on supported CPU).
> 

[snip]

> In pattern A, a change of dist at L1 has a larger effect. On the other
> hand, in pattern B, the change of dist at L2 has a larger effect.
> As described above, the optimal dist combination depends on the
> characteristics of the application. Therefore, such a sysfs interface
> is useful for performance tuning.

If this is something to be tuned for specific applications, shouldn't it
be a prctl or similar and part of process context, so different
applications can use different settings (or even a single application
depending on what it's doing)? Especially if writing those sysregs/MSRs
is cheap.

In particular, configuring things separately for different cores feels
strange. You'd then have to pin applications to specific cores to get
the benefits, and wouldn't be able to optimize for multiple applications
running simultaneously that need different prefetch behavior if they
share cores.

-- 
Hector Martin (marcan@marcan.st)
Public Key: https://mrcn.st/pub

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 0/8] Add hardware prefetch control driver for A64FX and x86
@ 2022-05-19  8:29   ` Hector Martin
  0 siblings, 0 replies; 50+ messages in thread
From: Hector Martin @ 2022-05-19  8:29 UTC (permalink / raw)
  To: Kohei Tarumizu, catalin.marinas, will, tglx, mingo, bp,
	dave.hansen, x86, hpa, gregkh, rafael, mchehab+huawei, eugenis,
	tony.luck, pcc, peterz, marcos, conor.dooley, nicolas.ferre,
	linus.walleij, arnd, linux-kernel, linux-arm-kernel

On 18/05/2022 15.30, Kohei Tarumizu wrote:
> This patch series add sysfs interface to control CPU's hardware
> prefetch behavior for performance tuning from userspace for the
> processor A64FX and x86 (on supported CPU).
> 

[snip]

> In pattern A, a change of dist at L1 has a larger effect. On the other
> hand, in pattern B, the change of dist at L2 has a larger effect.
> As described above, the optimal dist combination depends on the
> characteristics of the application. Therefore, such a sysfs interface
> is useful for performance tuning.

If this is something to be tuned for specific applications, shouldn't it
be a prctl or similar and part of process context, so different
applications can use different settings (or even a single application
depending on what it's doing)? Especially if writing those sysregs/MSRs
is cheap.

In particular, configuring things separately for different cores feels
strange. You'd then have to pin applications to specific cores to get
the benefits, and wouldn't be able to optimize for multiple applications
running simultaneously that need different prefetch behavior if they
share cores.

-- 
Hector Martin (marcan@marcan.st)
Public Key: https://mrcn.st/pub

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
  2022-05-18  6:43     ` Greg KH
@ 2022-05-20  6:30       ` tarumizu.kohei
  -1 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  6:30 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

> As much fun as it is to see a function like this, that is not ok, and never guaranteed
> to keep working, sorry.  Please find the device properly, never assume a specific
> driver model topology as they are guaranteed to break over time and never
> supposed to be static like this.

I will consider a method that does not depend on a specific driver
model topology in the next version.

> How do you know attributes are to be marked read only after init?  Do other parts
> of the kernel do that?  I don't think the driver core guarantees that at all.

I will fix a proper marking in the next version.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
@ 2022-05-20  6:30       ` tarumizu.kohei
  0 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  6:30 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

> As much fun as it is to see a function like this, that is not ok, and never guaranteed
> to keep working, sorry.  Please find the device properly, never assume a specific
> driver model topology as they are guaranteed to break over time and never
> supposed to be static like this.

I will consider a method that does not depend on a specific driver
model topology in the next version.

> How do you know attributes are to be marked read only after init?  Do other parts
> of the kernel do that?  I don't think the driver core guarantees that at all.

I will fix a proper marking in the next version.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver
  2022-05-18  6:43     ` Greg KH
@ 2022-05-20  6:35       ` tarumizu.kohei
  -1 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  6:35 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

> You need a lot more text here about what this is, what it does, why you would want
> it, and the module name it will create.

I add more text in the next version.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver
@ 2022-05-20  6:35       ` tarumizu.kohei
  0 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  6:35 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

> You need a lot more text here about what this is, what it does, why you would want
> it, and the module name it will create.

I add more text in the next version.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
  2022-05-18  6:44     ` Greg KH
@ 2022-05-20  6:40       ` tarumizu.kohei
  -1 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  6:40 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

> How will the driver be automatically loaded if this is the case or not?
> You can not rely on userspace knowing to load a driver on its own, please tie into
> the proper device discovery.

I got it. I will consider how the driver loads properly.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 6/8] x86: Add hardware prefetch control support for x86
@ 2022-05-20  6:40       ` tarumizu.kohei
  0 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  6:40 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

> How will the driver be automatically loaded if this is the case or not?
> You can not rely on userspace knowing to load a driver on its own, please tie into
> the proper device discovery.

I got it. I will consider how the driver loads properly.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 2/8] drivers: base: Add Kconfig/Makefile to build hardware prefetch control core driver
  2022-05-18  7:04     ` Greg KH
@ 2022-05-20  6:42       ` tarumizu.kohei
  -1 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  6:42 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

> Shouldn't this have "GENERIC" in the name liks the other generic implementations
> in this directory?

> But this is not a driver, it's a core function that a driver uses.  On its own, this
> code does nothing.

I consider removing the core driver in the next version. If I keep the
core driver, fix the name and description.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 2/8] drivers: base: Add Kconfig/Makefile to build hardware prefetch control core driver
@ 2022-05-20  6:42       ` tarumizu.kohei
  0 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  6:42 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

> Shouldn't this have "GENERIC" in the name liks the other generic implementations
> in this directory?

> But this is not a driver, it's a core function that a driver uses.  On its own, this
> code does nothing.

I consider removing the core driver in the next version. If I keep the
core driver, fix the name and description.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control
  2022-05-18  7:09     ` Greg KH
@ 2022-05-20  7:00       ` tarumizu.kohei
  -1 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  7:00 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

> Please do not put #if in .c files.

> Same here, no #if is needed if you do it properly in your .h file.

I will try not to put #if in .c files and do it the proper way.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control
@ 2022-05-20  7:00       ` tarumizu.kohei
  0 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  7:00 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

> Please do not put #if in .c files.

> Same here, no #if is needed if you do it properly in your .h file.

I will try not to put #if in .c files and do it the proper way.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 3/8] soc: fujitsu: Add hardware prefetch control support for A64FX
  2022-05-18  7:10     ` Greg KH
@ 2022-05-20  7:06       ` tarumizu.kohei
  -1 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  7:06 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz, marcos,
	conor.dooley, nicolas.ferre, marcan, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

> How will this driver be properly loaded by the system if the hardware is
> present?

I will consider how the driver loads properly.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 3/8] soc: fujitsu: Add hardware prefetch control support for A64FX
@ 2022-05-20  7:06       ` tarumizu.kohei
  0 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  7:06 UTC (permalink / raw)
  To: 'Greg KH'
  Cc: rafael, peterz, catalin.marinas, linus.walleij, dave.hansen,
	conor.dooley, hpa, will, mchehab+huawei, x86, mingo, eugenis,
	arnd, bp, tglx, pcc, linux-arm-kernel, marcos, tony.luck, marcan,
	linux-kernel

> How will this driver be properly loaded by the system if the hardware is
> present?

I will consider how the driver loads properly.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 0/8] Add hardware prefetch control driver for A64FX and x86
  2022-05-19  8:29   ` Hector Martin
@ 2022-05-20  8:31     ` tarumizu.kohei
  -1 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  8:31 UTC (permalink / raw)
  To: 'Hector Martin',
	catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

Thanks for the comment.

> If this is something to be tuned for specific applications, shouldn't it be a prctl or
> similar and part of process context, so different applications can use different
> settings (or even a single application depending on what it's doing)? Especially if
> writing those sysregs/MSRs is cheap.

> In particular, configuring things separately for different cores feels strange. You'd
> then have to pin applications to specific cores to get the benefits, and wouldn't be
> able to optimize for multiple applications running simultaneously that need
> different prefetch behavior if they share cores.

As you say, this is used for tuning specific applications.

I assume that users using this feature bind an application to a specific
core and use it exclusively. This is not only for pfctl, but also to
prevent performance from being affected by context switches, etc.

I agree that it is also useful to be able to control in the process
context. However, in this case, I think that it is sufficient if it can
be provided as a userspace interface which expresses the hardware
prefetch register directly, assuming the above usage.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [PATCH v4 0/8] Add hardware prefetch control driver for A64FX and x86
@ 2022-05-20  8:31     ` tarumizu.kohei
  0 siblings, 0 replies; 50+ messages in thread
From: tarumizu.kohei @ 2022-05-20  8:31 UTC (permalink / raw)
  To: 'Hector Martin',
	catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	gregkh, rafael, mchehab+huawei, eugenis, tony.luck, pcc, peterz,
	marcos, conor.dooley, nicolas.ferre, linus.walleij, arnd,
	linux-kernel, linux-arm-kernel

Thanks for the comment.

> If this is something to be tuned for specific applications, shouldn't it be a prctl or
> similar and part of process context, so different applications can use different
> settings (or even a single application depending on what it's doing)? Especially if
> writing those sysregs/MSRs is cheap.

> In particular, configuring things separately for different cores feels strange. You'd
> then have to pin applications to specific cores to get the benefits, and wouldn't be
> able to optimize for multiple applications running simultaneously that need
> different prefetch behavior if they share cores.

As you say, this is used for tuning specific applications.

I assume that users using this feature bind an application to a specific
core and use it exclusively. This is not only for pfctl, but also to
prevent performance from being affected by context switches, etc.

I agree that it is also useful to be able to control in the process
context. However, in this case, I think that it is sufficient if it can
be provided as a userspace interface which expresses the hardware
prefetch register directly, assuming the above usage.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2022-05-20  8:33 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-18  6:30 [PATCH v4 0/8] Add hardware prefetch control driver for A64FX and x86 Kohei Tarumizu
2022-05-18  6:30 ` Kohei Tarumizu
2022-05-18  6:30 ` [PATCH v4 1/8] drivers: base: Add hardware prefetch control core driver Kohei Tarumizu
2022-05-18  6:30   ` Kohei Tarumizu
2022-05-18  7:09   ` Greg KH
2022-05-18  7:09     ` Greg KH
2022-05-18 12:38     ` tarumizu.kohei
2022-05-18 12:38       ` tarumizu.kohei
2022-05-18  6:30 ` [PATCH v4 2/8] drivers: base: Add Kconfig/Makefile to build " Kohei Tarumizu
2022-05-18  6:30   ` Kohei Tarumizu
2022-05-18  7:04   ` Greg KH
2022-05-18  7:04     ` Greg KH
2022-05-20  6:42     ` tarumizu.kohei
2022-05-20  6:42       ` tarumizu.kohei
2022-05-18  6:30 ` [PATCH v4 3/8] soc: fujitsu: Add hardware prefetch control support for A64FX Kohei Tarumizu
2022-05-18  6:30   ` Kohei Tarumizu
2022-05-18  7:10   ` Greg KH
2022-05-18  7:10     ` Greg KH
2022-05-20  7:06     ` tarumizu.kohei
2022-05-20  7:06       ` tarumizu.kohei
2022-05-18  6:30 ` [PATCH v4 4/8] soc: fujitsu: Add Kconfig/Makefile to build hardware prefetch control driver Kohei Tarumizu
2022-05-18  6:30   ` Kohei Tarumizu
2022-05-18  6:30 ` [PATCH v4 5/8] arm64: Create cache sysfs directory without ACPI PPTT for hardware prefetch control Kohei Tarumizu
2022-05-18  6:30   ` Kohei Tarumizu
2022-05-18  7:09   ` Greg KH
2022-05-18  7:09     ` Greg KH
2022-05-20  7:00     ` tarumizu.kohei
2022-05-20  7:00       ` tarumizu.kohei
2022-05-18  6:30 ` [PATCH v4 6/8] x86: Add hardware prefetch control support for x86 Kohei Tarumizu
2022-05-18  6:30   ` Kohei Tarumizu
2022-05-18  6:43   ` Greg KH
2022-05-18  6:43     ` Greg KH
2022-05-20  6:30     ` tarumizu.kohei
2022-05-20  6:30       ` tarumizu.kohei
2022-05-18  6:44   ` Greg KH
2022-05-18  6:44     ` Greg KH
2022-05-20  6:40     ` tarumizu.kohei
2022-05-20  6:40       ` tarumizu.kohei
2022-05-18  6:30 ` [PATCH v4 7/8] x86: Add Kconfig/Makefile to build hardware prefetch control driver Kohei Tarumizu
2022-05-18  6:30   ` Kohei Tarumizu
2022-05-18  6:43   ` Greg KH
2022-05-18  6:43     ` Greg KH
2022-05-20  6:35     ` tarumizu.kohei
2022-05-20  6:35       ` tarumizu.kohei
2022-05-18  6:30 ` [PATCH v4 8/8] docs: ABI: Add sysfs documentation interface of " Kohei Tarumizu
2022-05-18  6:30   ` Kohei Tarumizu
2022-05-19  8:29 ` [PATCH v4 0/8] Add hardware prefetch control driver for A64FX and x86 Hector Martin
2022-05-19  8:29   ` Hector Martin
2022-05-20  8:31   ` tarumizu.kohei
2022-05-20  8:31     ` tarumizu.kohei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.