All of lore.kernel.org
 help / color / mirror / Atom feed
From: alexander.levin@verizon.com
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	"Arnaldo Carvalho de Melo" <acme@kernel.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	"Jiri Olsa" <jolsa@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Stephane Eranian <eranian@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vince Weaver <vince@deater.net>,
	Vince Weaver <vincent.weaver@maine.edu>,
	Ingo Molnar <mingo@kernel.org>,
	alexander.levin@verizon.com
Subject: [PATCH AUTOSEL for 4.9 27/54] perf/x86/intel: Account interrupts for PEBS errors
Date: Wed, 22 Nov 2017 22:23:57 +0000	[thread overview]
Message-ID: <20171122222344.19782-27-alexander.levin@verizon.com> (raw)
In-Reply-To: <20171122222344.19782-1-alexander.levin@verizon.com>

From: Jiri Olsa <jolsa@kernel.org>

[ Upstream commit 475113d937adfd150eb82b5e2c5507125a68e7af ]

It's possible to set up PEBS events to get only errors and not
any data, like on SNB-X (model 45) and IVB-EP (model 62)
via 2 perf commands running simultaneously:

    taskset -c 1 ./perf record -c 4 -e branches:pp -j any -C 10

This leads to a soft lock up, because the error path of the
intel_pmu_drain_pebs_nhm() does not account event->hw.interrupt
for error PEBS interrupts, so in case you're getting ONLY
errors you don't have a way to stop the event when it's over
the max_samples_per_tick limit:

  NMI watchdog: BUG: soft lockup - CPU#22 stuck for 22s! [perf_fuzzer:5816]
  ...
  RIP: 0010:[<ffffffff81159232>]  [<ffffffff81159232>] smp_call_function_single+0xe2/0x140
  ...
  Call Trace:
   ? trace_hardirqs_on_caller+0xf5/0x1b0
   ? perf_cgroup_attach+0x70/0x70
   perf_install_in_context+0x199/0x1b0
   ? ctx_resched+0x90/0x90
   SYSC_perf_event_open+0x641/0xf90
   SyS_perf_event_open+0x9/0x10
   do_syscall_64+0x6c/0x1f0
   entry_SYSCALL64_slow_path+0x25/0x25

Add perf_event_account_interrupt() which does the interrupt
and frequency checks and call it from intel_pmu_drain_pebs_nhm()'s
error path.

We keep the pending_kill and pending_wakeup logic only in the
__perf_event_overflow() path, because they make sense only if
there's any data to deliver.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1482931866-6018-2-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
---
 arch/x86/events/intel/ds.c |  6 +++++-
 include/linux/perf_event.h |  1 +
 kernel/events/core.c       | 47 ++++++++++++++++++++++++++++++----------------
 3 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index be202390bbd3..9dfeeeca0ea8 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1389,9 +1389,13 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
 			continue;
 
 		/* log dropped samples number */
-		if (error[bit])
+		if (error[bit]) {
 			perf_log_lost_samples(event, error[bit]);
 
+			if (perf_event_account_interrupt(event))
+				x86_pmu_stop(event, 0);
+		}
+
 		if (counts[bit]) {
 			__intel_pmu_pebs_event(event, iregs, base,
 					       top, bit, counts[bit]);
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 4741ecdb9817..78ed8105e64d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1259,6 +1259,7 @@ extern void perf_event_disable(struct perf_event *event);
 extern void perf_event_disable_local(struct perf_event *event);
 extern void perf_event_disable_inatomic(struct perf_event *event);
 extern void perf_event_task_tick(void);
+extern int perf_event_account_interrupt(struct perf_event *event);
 #else /* !CONFIG_PERF_EVENTS: */
 static inline void *
 perf_aux_output_begin(struct perf_output_handle *handle,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 36ff2d93f222..13b9784427b0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7088,25 +7088,12 @@ static void perf_log_itrace_start(struct perf_event *event)
 	perf_output_end(&handle);
 }
 
-/*
- * Generic event overflow handling, sampling.
- */
-
-static int __perf_event_overflow(struct perf_event *event,
-				   int throttle, struct perf_sample_data *data,
-				   struct pt_regs *regs)
+static int
+__perf_event_account_interrupt(struct perf_event *event, int throttle)
 {
-	int events = atomic_read(&event->event_limit);
 	struct hw_perf_event *hwc = &event->hw;
-	u64 seq;
 	int ret = 0;
-
-	/*
-	 * Non-sampling counters might still use the PMI to fold short
-	 * hardware counters, ignore those.
-	 */
-	if (unlikely(!is_sampling_event(event)))
-		return 0;
+	u64 seq;
 
 	seq = __this_cpu_read(perf_throttled_seq);
 	if (seq != hwc->interrupts_seq) {
@@ -7134,6 +7121,34 @@ static int __perf_event_overflow(struct perf_event *event,
 			perf_adjust_period(event, delta, hwc->last_period, true);
 	}
 
+	return ret;
+}
+
+int perf_event_account_interrupt(struct perf_event *event)
+{
+	return __perf_event_account_interrupt(event, 1);
+}
+
+/*
+ * Generic event overflow handling, sampling.
+ */
+
+static int __perf_event_overflow(struct perf_event *event,
+				   int throttle, struct perf_sample_data *data,
+				   struct pt_regs *regs)
+{
+	int events = atomic_read(&event->event_limit);
+	int ret = 0;
+
+	/*
+	 * Non-sampling counters might still use the PMI to fold short
+	 * hardware counters, ignore those.
+	 */
+	if (unlikely(!is_sampling_event(event)))
+		return 0;
+
+	ret = __perf_event_account_interrupt(event, throttle);
+
 	/*
 	 * XXX event_limit might not quite work as expected on inherited
 	 * events
-- 
2.11.0

  parent reply	other threads:[~2017-11-22 23:20 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-22 22:23 [PATCH AUTOSEL for 4.9 01/54] dax: Avoid page invalidation races and unnecessary radix tree traversals alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 04/54] dmaengine: stm32-dma: Set correct args number for DMA request from DT alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 02/54] net/mlx4_en: Fix type mismatch for 32-bit systems alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 03/54] l2tp: take remote address into account in l2tp_ip and l2tp_ip6 socket lookups alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 06/54] usb: gadget: f_fs: Fix ExtCompat descriptor validation alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 07/54] libcxgb: fix error check for ip6_route_output() alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 08/54] net: systemport: Utilize skb_put_padto() alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 05/54] dmaengine: stm32-dma: Fix null pointer dereference in stm32_dma_tx_status alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 11/54] ARM: OMAP1: DMA: Correct the number of logical channels alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 13/54] be2net: fix accesses to unicast list alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 12/54] vti6: fix device register to report IFLA_INFO_KIND alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 10/54] ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 09/54] net: systemport: Pad packet before inserting TSB alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 14/54] be2net: fix unicast list filling alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 15/54] net/appletalk: Fix kernel memory disclosure alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 19/54] mac80211: calculate min channel width correctly alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 18/54] mm: fix remote numa hits statistics alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 16/54] libfs: Modify mount_pseudo_xattr to be clear it is not a userspace mount alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 17/54] net: qrtr: Mark 'buf' as little endian alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 21/54] nfs: Don't take a reference on fl->fl_file for LOCK operation alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 20/54] ravb: Remove Rx overflow log messages alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 22/54] drm/exynos/decon5433: update shadow registers iff there are active windows alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 23/54] drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 25/54] mac80211: prevent skb/txq mismatch alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 24/54] KVM: arm/arm64: Fix occasional warning from the timer work function alexander.levin
2017-11-22 22:23 ` alexander.levin [this message]
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 26/54] NFSv4: Fix client recovery when server reboots multiple times alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 28/54] powerpc/mm: Fix memory hotplug BUG() on radix alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 29/54] qla2xxx: Fix wrong IOCB type assumption alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 31/54] drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 30/54] drm/amdgpu: fix bug set incorrect value to vce register alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 35/54] mac80211: don't try to sleep in rate_control_rate_init() alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 32/54] net: sctp: fix array overrun read on sctp_timer_tbl alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 33/54] x86/fpu: Set the xcomp_bv when we fake up a XSAVES area alexander.levin
2017-11-22 22:23 ` [PATCH AUTOSEL for 4.9 34/54] drm/amdgpu: fix unload driver issue for virtual display alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 36/54] RDMA/qedr: Return success when not changing QP state alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 39/54] tipc: fix cleanup at module unload alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 37/54] RDMA/qedr: Fix RDMA CM loopback alexander.levin
2017-11-23  6:11   ` Amrani, Ram
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 38/54] tipc: fix nametbl_lock soft lockup at module exit alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 42/54] i2c: i2c-cadence: Initialize configuration before probing devices alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 44/54] gtp: clear DF bit on GTP packet tx alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 40/54] dmaengine: pl330: fix double lock alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 43/54] nvmet: cancel fatal error and flush async work before free controller alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 41/54] tcp: correct memory barrier usage in tcp_check_space() alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 47/54] net: thunderx: avoid dereferencing xcv when NULL alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 48/54] be2net: fix initial MAC setting alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 45/54] gtp: fix cross netns recv on gtp socket alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 46/54] net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 49/54] vfio/spapr: Fix missing mutex unlock when creating a window alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 50/54] mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 52/54] [media] cec: initiator should be the same as the destination for, poll alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 51/54] xen-netfront: Improve error handling during initialization alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 53/54] xen-netback: vif counters from int/long to u64 alexander.levin
2017-11-22 22:24 ` [PATCH AUTOSEL for 4.9 54/54] net: fec: fix multicast filtering hardware setup alexander.levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171122222344.19782-27-alexander.levin@verizon.com \
    --to=alexander.levin@verizon.com \
    --cc=acme@kernel.org \
    --cc=acme@redhat.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=eranian@google.com \
    --cc=jolsa@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=vince@deater.net \
    --cc=vincent.weaver@maine.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.