linux-kernel.vger.kernel.org archive mirror
From: "tarumizu.kohei@fujitsu.com" <tarumizu.kohei@fujitsu.com>
To: 'Borislav Petkov' <bp@alien8.de>
Cc: "catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"will@kernel.org" <will@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [RFC PATCH v2 0/5] Add hardware prefetch driver for A64FX and Intel processors
Date: Mon, 8 Nov 2021 02:17:43 +0000
Message-ID: <OSBPR01MB20370518F9296BA4302FF7DC80919@OSBPR01MB2037.jpnprd01.prod.outlook.com>
In-Reply-To: <YYP4fAgKSh4bVvgD@zn.tnic>

Hi,

Thanks for your comment.

> This is all fine and dandy but what I'm missing in this pile of text - at least I couldn't
> find it - is why do we need this in the upstream kernel?
> 
> Is there some real-life use case that would benefit from software fiddling with
> prefetchers or is this one of those, well, we have those controls, lets expose them
> in the OS?
> 
> IOW, you need to sell this stuff properly first - then talk design.

A64FX and some Intel processors have implementation-dependent registers
for controlling hardware prefetch. Intel has MSR_MISC_FEATURE_CONTROL,
and A64FX has IMP_PF_STREAM_DETECT_CTRL_EL0. These registers cannot be
accessed from userspace, so we provide a proper kernel interface.
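
To illustrate what such a register controls, here is a minimal sketch
that decodes a raw MSR_MISC_FEATURE_CONTROL value. It assumes the bit
layout that Intel has published for some processor models (address
0x1A4, bits 0 to 3, where a set bit disables one prefetcher). The layout
is model-specific, and the helper name decode_prefetch_ctrl is made up
for this example; it is not part of the patch series.

```python
# Assumed layout of MSR_MISC_FEATURE_CONTROL on some Intel models.
# A set bit DISABLES the corresponding prefetcher; other models may differ.
MSR_MISC_FEATURE_CONTROL = 0x1A4

PREFETCH_DISABLE_BITS = {
    "l2_hw_prefetcher": 0,
    "l2_adjacent_line_prefetcher": 1,
    "dcu_hw_prefetcher": 2,
    "dcu_ip_prefetcher": 3,
}


def decode_prefetch_ctrl(raw):
    """Return {prefetcher name: True if enabled} for a raw register value."""
    return {
        name: ((raw >> bit) & 1) == 0  # bit clear means prefetcher enabled
        for name, bit in PREFETCH_DISABLE_BITS.items()
    }
```

The sketch only interprets a value that has already been read; obtaining
or changing the value is what the kernel interface is for.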

The advantage of exposing this interface to userspace is that
applications can be tuned for better performance.

The following performance improvements have been reported for some
Intel processors.
https://github.com/xmrig/xmrig/issues/1433#issuecomment-572126184

On A64FX, several applications have also shown actual performance
improvements. In most of these cases, we tune the hardware prefetch
distance parameter. One of them is the STREAM benchmark.

For reference, here is the result of STREAM Triad when tuning with
the dist attribute file in L1 and L2 cache on A64FX.

| dist combination  | Pattern A   | Pattern B   |
|-------------------|-------------|-------------|
| L1:256,  L2:1024  | 234505.2144 | 114600.0801 |
| L1:1536, L2:1024  | 279172.8742 | 118979.4542 |
| L1:256,  L2:10240 | 247716.7757 | 127364.1533 |
| L1:1536, L2:10240 | 283675.6625 | 125950.6847 |

In pattern A, we set the size of the array to 174720, which is about
half the size of the L1d cache. In pattern B, we set the size of the
array to 10485120, which is about twice the size of the L2 cache.
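
For readers unfamiliar with the benchmark, the Triad kernel can be
sketched as follows. This is only an illustration of the access pattern
(two streamed reads and one streamed write per element), not the
benchmark code we ran. It assumes 8-byte double-precision elements and
that the array size is counted in elements; the function names are
chosen for this example.

```python
def triad(b, c, scalar):
    """STREAM Triad kernel: a[i] = b[i] + scalar * c[i]."""
    return [bi + scalar * ci for bi, ci in zip(b, c)]


def triad_bytes_moved(n, elem_size=8):
    """Bytes counted per Triad pass: two reads and one write per element."""
    return 3 * elem_size * n
```

Because all three arrays are walked sequentially, the hardware
prefetcher sees regular streams, which is why the prefetch distance
matters for this kernel.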

In pattern A, a change of dist at L1 has a larger effect. On the other
hand, in pattern B, the change of dist at L2 has a larger effect.
As described above, the optimal dist combination depends on the
characteristics of the application. Therefore, such a sysfs interface
is useful for performance tuning.
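
The comparison can be made concrete with a small calculation. The sketch
below takes the values from the table as-is and computes the relative
gain over the L1:256, L2:1024 setting when only one dist value is
raised. The names RESULTS, BASELINE and relative_gain are chosen for
this example.

```python
# STREAM Triad results from the table, keyed by (L1 dist, L2 dist).
RESULTS = {
    "A": {
        (256, 1024): 234505.2144,
        (1536, 1024): 279172.8742,
        (256, 10240): 247716.7757,
        (1536, 10240): 283675.6625,
    },
    "B": {
        (256, 1024): 114600.0801,
        (1536, 1024): 118979.4542,
        (256, 10240): 127364.1533,
        (1536, 10240): 125950.6847,
    },
}

BASELINE = (256, 1024)


def relative_gain(pattern, setting):
    """Fractional improvement of `setting` over the baseline dist setting."""
    base = RESULTS[pattern][BASELINE]
    return RESULTS[pattern][setting] / base - 1.0


if __name__ == "__main__":
    for pattern in ("A", "B"):
        l1_only = relative_gain(pattern, (1536, 1024))
        l2_only = relative_gain(pattern, (256, 10240))
        print(f"Pattern {pattern}: L1 dist only {l1_only:+.1%}, "
              f"L2 dist only {l2_only:+.1%}")
```

With these numbers, raising only the L1 dist gives about 19% in pattern
A but about 4% in pattern B, while raising only the L2 dist gives about
6% in pattern A and about 11% in pattern B.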

For these reasons, we would like to add this interface to the
upstream kernel.

> I'm not sure about a wholly separate drivers/hwpf/ - it's not like there are
> gazillion different hw prefetch drivers.

We created a new directory to keep the multiple separate files in one
place. We do not think this is necessarily the best approach. If there
is a more suitable location, we would be happy to change it.
