linux-kernel.vger.kernel.org archive mirror
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Kim Phillips <kim.phillips@amd.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Arnaldo Carvalho de Melo" <acme@kernel.org>,
	x86@kernel.org, Ingo Molnar <mingo@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Jiri Olsa <jolsa@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Borislav Petkov" <bp@alien8.de>,
	Stephane Eranian <eranian@google.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	"Namhyung Kim" <namhyung@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 5.2 72/94] perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops
Date: Wed,  4 Sep 2019 11:57:17 -0400
Message-ID: <20190904155739.2816-72-sashal@kernel.org>
In-Reply-To: <20190904155739.2816-1-sashal@kernel.org>

From: Kim Phillips <kim.phillips@amd.com>

[ Upstream commit 0f4cd769c410e2285a4e9873a684d90423f03090 ]

When counting dispatched micro-ops with cnt_ctl=1, in order to prevent
sample bias, IBS hardware preloads the least significant 7 bits of
current count (IbsOpCurCnt) with random values, such that, after the
interrupt is handled and counting resumes, the next sample taken
will be slightly perturbed.

The current count bitfield is in the IBS execution control h/w register,
alongside the maximum count field.

Currently, the IBS driver writes that register with the maximum count,
leaving zeroes to fill the current count field, thereby overwriting
the random bits the hardware preloaded for itself.

Fix the driver to actually retain and carry those random bits from the
read of the IBS control register, through to its write, instead of
overwriting the lower current count bits with zeroes.
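
As a rough illustration of the fix described above, the re-arm value can be
sketched in standalone C. The two mask values are copied from this patch;
the helper name rearm_value() and the has_rdwropcnt flag (standing in for
the ibs_caps & IBS_CAPS_RDWROPCNT check) are hypothetical, and this is only
a sketch of the bit arithmetic, not the driver code itself:

```c
#include <stdint.h>

/* Mask values copied from the patch. */
#define IBS_OP_CNT_CTL       (1ULL << 19)
#define IBS_OP_CUR_CNT_RAND  (0x0007FULL << 32)

/*
 * Hypothetical helper: compute the value used to re-enable the event.
 * 'config' is the control register value read in the interrupt handler,
 * 'period' is the sample period, and 'has_rdwropcnt' stands in for the
 * capability check done in the patch.
 */
static uint64_t rearm_value(uint64_t config, uint64_t period, int has_rdwropcnt)
{
	uint64_t val = period >> 4;	/* same shift the driver applies */

	/* Carry the hardware-preloaded random bits through to the write. */
	if (has_rdwropcnt && (config & IBS_OP_CNT_CTL))
		val |= config & IBS_OP_CUR_CNT_RAND;

	return val;
}
```

With cnt_ctl set and random bits 0x5a read back, a period of 100001 yields
6250 in the low bits plus 0x5a at bit 32; without the capability, or with
cnt_ctl clear, only 6250 is written, which matches the old behavior.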

Tested with:

perf record -c 100001 -e ibs_op/cnt_ctl=1/pp -a -C 0 taskset -c 0 <workload>

'perf annotate' output before:

 15.70  65:   addsd     %xmm0,%xmm1
 17.30        add       $0x1,%rax
 15.88        cmp       %rdx,%rax
              je        82
 17.32  72:   test      $0x1,%al
              jne       7c
  7.52        movapd    %xmm1,%xmm0
  5.90        jmp       65
  8.23  7c:   sqrtsd    %xmm1,%xmm0
 12.15        jmp       65

'perf annotate' output after:

 16.63  65:   addsd     %xmm0,%xmm1
 16.82        add       $0x1,%rax
 16.81        cmp       %rdx,%rax
              je        82
 16.69  72:   test      $0x1,%al
              jne       7c
  8.30        movapd    %xmm1,%xmm0
  8.13        jmp       65
  8.24  7c:   sqrtsd    %xmm1,%xmm0
  8.39        jmp       65

Tested on Family 15h and 17h machines.

Machines prior to family 10h Rev. C don't have the RDWROPCNT capability,
and have the IbsOpCurCnt bitfield reserved, so this patch shouldn't
affect their operation.

It is unknown why commit db98c5faf8cb ("perf/x86: Implement 64-bit
counter support for IBS") ignored the lower 4 bits of the IbsOpCurCnt
field; the number of preloaded random bits has always been 7, AFAICT.
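
To make the mask arithmetic concrete, here is a small standalone C sketch.
The three mask values are taken from the diff in this patch (the old one is
renamed OLD_IBS_OP_CUR_CNT here for comparison); the helper functions are
hypothetical and exist only to check how the masks relate:

```c
#include <stdint.h>

/* Mask values as they appear in the patch. */
#define OLD_IBS_OP_CUR_CNT   (0xFFFF0ULL << 32)	/* before: low 4 bits ignored */
#define IBS_OP_CUR_CNT       (0xFFF80ULL << 32)	/* after: low 7 bits excluded */
#define IBS_OP_CUR_CNT_RAND  (0x0007FULL << 32)	/* after: the 7 random bits */

/* Hypothetical helper: shift a mask down to field-relative bit positions. */
static uint64_t field_bits(uint64_t mask)
{
	return mask >> 32;
}

/* Nonzero if the two new masks are disjoint and cover all 20 field bits. */
static int masks_partition_field(void)
{
	uint64_t cnt = field_bits(IBS_OP_CUR_CNT);
	uint64_t rnd = field_bits(IBS_OP_CUR_CNT_RAND);

	return (cnt & rnd) == 0 && (cnt | rnd) == 0xFFFFFULL;
}

/* The random bits that the old mask left out (field-relative). */
static uint64_t rand_bits_outside_old_mask(void)
{
	return field_bits(IBS_OP_CUR_CNT_RAND & ~OLD_IBS_OP_CUR_CNT);
}
```

The old mask excluded only the low 4 of the 7 random bits, so bits 4-6 were
still treated as part of the count; the new pair splits the 20-bit field
cleanly into 13 count bits and 7 random bits.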

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Arnaldo Carvalho de Melo" <acme@kernel.org>
Cc: <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: "Namhyung Kim" <namhyung@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20190826195730.30614-1-kim.phillips@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/events/amd/ibs.c         | 13 ++++++++++---
 arch/x86/include/asm/perf_event.h | 12 ++++++++----
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 62f317c9113af..5b35b7ea5d728 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -661,10 +661,17 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
 
 	throttle = perf_event_overflow(event, &data, &regs);
 out:
-	if (throttle)
+	if (throttle) {
 		perf_ibs_stop(event, 0);
-	else
-		perf_ibs_enable_event(perf_ibs, hwc, period >> 4);
+	} else {
+		period >>= 4;
+
+		if ((ibs_caps & IBS_CAPS_RDWROPCNT) &&
+		    (*config & IBS_OP_CNT_CTL))
+			period |= *config & IBS_OP_CUR_CNT_RAND;
+
+		perf_ibs_enable_event(perf_ibs, hwc, period);
+	}
 
 	perf_event_update_userpage(event);
 
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 1392d5e6e8d67..ee26e9215f187 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -252,16 +252,20 @@ struct pebs_lbr {
 #define IBSCTL_LVT_OFFSET_VALID		(1ULL<<8)
 #define IBSCTL_LVT_OFFSET_MASK		0x0F
 
-/* ibs fetch bits/masks */
+/* IBS fetch bits/masks */
 #define IBS_FETCH_RAND_EN	(1ULL<<57)
 #define IBS_FETCH_VAL		(1ULL<<49)
 #define IBS_FETCH_ENABLE	(1ULL<<48)
 #define IBS_FETCH_CNT		0xFFFF0000ULL
 #define IBS_FETCH_MAX_CNT	0x0000FFFFULL
 
-/* ibs op bits/masks */
-/* lower 4 bits of the current count are ignored: */
-#define IBS_OP_CUR_CNT		(0xFFFF0ULL<<32)
+/*
+ * IBS op bits/masks
+ * The lower 7 bits of the current count are random bits
+ * preloaded by hardware and ignored in software
+ */
+#define IBS_OP_CUR_CNT		(0xFFF80ULL<<32)
+#define IBS_OP_CUR_CNT_RAND	(0x0007FULL<<32)
 #define IBS_OP_CNT_CTL		(1ULL<<19)
 #define IBS_OP_VAL		(1ULL<<18)
 #define IBS_OP_ENABLE		(1ULL<<17)
-- 
2.20.1

