All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Don Zickus <dzickus@redhat.com>
Cc: dave.hansen@linux.intel.com, eranian@google.com,
	ak@linux.intel.com, jmario@redhat.com,
	linux-kernel@vger.kernel.org, acme@infradead.org,
	mingo@kernel.org
Subject: Re: [PATCH] perf, x86: Optimize intel_pmu_pebs_fixup_ip()
Date: Wed, 16 Oct 2013 15:31:25 +0200	[thread overview]
Message-ID: <20131016133125.GY10651@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <20131016124649.GG227855@redhat.com>

On Wed, Oct 16, 2013 at 08:46:49AM -0400, Don Zickus wrote:
> On Wed, Oct 16, 2013 at 12:57:55PM +0200, Peter Zijlstra wrote:
> > A prettier patch below. The main difference is on-demand allocation of
> > the scratch buffer.
> 
> I'll see if I can sanity test this in the next couple hours.
> 
> Further testing yesterday showed that intel_pmu_drain_pebs_nhm still
> has long latencies somewhere.  With 15 minute reboots, isolation goes
> slooow.

Pick a smaller box? I seem to be able to reproduce on my wsm-ep, which
boots inside a minute :-)

root@westmere:~# cd /debug/tracing/
root@westmere:/debug/tracing# echo function > current_tracer
root@westmere:/debug/tracing# cat available_filter_functions | grep ^inat > set_ftrace_notrace
root@westmere:/debug/tracing# cat available_filter_functions | grep ^insn | grep -v get_length >> set_ftrace_notrace

Run: perf top --stdio -e 'cycles:pp' in another window and when the
console output shows:

[  610.319486] perf samples too long (19310 > 19230), lowering kernel.perf_event_max_sample_rate to 7000

quickly press enter here:

root@westmere:/debug/tracing# echo 0 > tracing_on
root@westmere:/debug/tracing# cat trace > ~/trace1
root@westmere:/debug/tracing# cat ~/trace1 | awk '/rcu_nmi_enter/ {
t=gensub(":", "", "g", $4); cpu=gensub("[][]", "", "g", $2);
start[cpu]=t; } /rcu_nmi_exit/ { x=gensub(":", "", "g", $4);
cpu=gensub("[][]", "", "g", $2); t=start[cpu]; printf "%6.6f -- starting
at: %6.6f on cpu: %d\n", x-t, t, cpu } ' | sort -n | tail -10
0.000037 -- starting at: 605.317795 on cpu: 9
0.000039 -- starting at: 602.831019 on cpu: 23
0.000039 -- starting at: 602.831148 on cpu: 6
0.000039 -- starting at: 602.955953 on cpu: 20
0.000040 -- starting at: 602.834012 on cpu: 18
0.000040 -- starting at: 602.956972 on cpu: 21
0.000040 -- starting at: 602.960048 on cpu: 22
0.000040 -- starting at: 609.290776 on cpu: 7
0.000075 -- starting at: 609.773875 on cpu: 0
0.009398 -- starting at: 610.319445 on cpu: 1
root@westmere:/debug/tracing# grep "\[001\]" ~/trace1 | awk 'BEGIN {p=0}
/610.319445/ {p=1} /rcu_nmi_exit/ {p=0} {if (p) print $0}'

Now obviously the whole printk stuff below is insane, but it does show
its one that triggered the check. And the trace does give a fair idea of
what its doing:

          <idle>-0     [001] d.h.   610.319445: rcu_nmi_enter <-do_nmi
          <idle>-0     [001] d.h.   610.319446: nmi_handle.isra.3 <-do_nmi
          <idle>-0     [001] d.h.   610.319447: intel_pmu_handle_irq <-perf_event_nmi_handler
          <idle>-0     [001] d.h.   610.319447: intel_pmu_disable_all <-intel_pmu_handle_irq
          <idle>-0     [001] d.h.   610.319448: intel_pmu_pebs_disable_all <-intel_pmu_disable_all
          <idle>-0     [001] d.h.   610.319448: intel_pmu_lbr_disable_all <-intel_pmu_disable_all
          <idle>-0     [001] d.h.   610.319449: intel_pmu_drain_bts_buffer <-intel_pmu_handle_irq
          <idle>-0     [001] d.h.   610.319449: intel_pmu_lbr_read <-intel_pmu_handle_irq
          <idle>-0     [001] d.h.   610.319453: intel_pmu_drain_pebs_nhm <-intel_pmu_handle_irq
          <idle>-0     [001] d.h.   610.319453: __intel_pmu_pebs_event <-intel_pmu_drain_pebs_nhm
          <idle>-0     [001] d.h.   610.319454: intel_pmu_save_and_restart <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319455: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319456: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319457: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319458: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319458: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319459: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319459: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319460: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319460: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319461: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319461: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319462: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319462: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319463: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319464: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319464: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319465: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319465: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319466: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319466: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319467: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319467: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319468: insn_get_length <-__intel_pmu_pebs_event
          <idle>-0     [001] d.h.   610.319469: task_tgid_nr_ns <-perf_event_pid
          <idle>-0     [001] d.h.   610.319469: __task_pid_nr_ns <-perf_event_tid
          <idle>-0     [001] d.h.   610.319470: perf_output_begin <-perf_log_throttle
          <idle>-0     [001] d.h.   610.319470: perf_output_copy <-perf_log_throttle
          <idle>-0     [001] d.h.   610.319470: perf_output_copy <-perf_event__output_id_sample
          <idle>-0     [001] d.h.   610.319471: perf_output_copy <-perf_event__output_id_sample
          <idle>-0     [001] d.h.   610.319472: perf_output_copy <-perf_event__output_id_sample
          <idle>-0     [001] d.h.   610.319472: perf_output_end <-perf_log_throttle
          <idle>-0     [001] d.h.   610.319472: perf_output_put_handle <-perf_output_end
          <idle>-0     [001] d.h.   610.319473: kvm_is_in_guest <-perf_misc_flags
          <idle>-0     [001] d.h.   610.319473: task_tgid_nr_ns <-perf_event_pid
          <idle>-0     [001] d.h.   610.319474: __task_pid_nr_ns <-perf_event_tid
          <idle>-0     [001] d.h.   610.319474: kvm_is_in_guest <-perf_instruction_pointer
          <idle>-0     [001] d.h.   610.319475: perf_output_begin <-__perf_event_overflow
          <idle>-0     [001] d.h.   610.319475: perf_output_copy <-perf_output_sample
          <idle>-0     [001] d.h.   610.319475: perf_output_copy <-perf_output_sample
          <idle>-0     [001] d.h.   610.319476: perf_output_copy <-perf_output_sample
          <idle>-0     [001] d.h.   610.319476: perf_output_copy <-perf_output_sample
          <idle>-0     [001] d.h.   610.319476: perf_output_copy <-perf_output_sample
          <idle>-0     [001] d.h.   610.319477: perf_output_copy <-perf_output_sample
          <idle>-0     [001] d.h.   610.319477: perf_output_end <-__perf_event_overflow
          <idle>-0     [001] d.h.   610.319477: perf_output_put_handle <-perf_output_end
          <idle>-0     [001] d.h.   610.319478: intel_pmu_disable_event <-x86_pmu_stop
          <idle>-0     [001] d.h.   610.319478: intel_pmu_lbr_disable <-intel_pmu_disable_event
          <idle>-0     [001] d.h.   610.319479: intel_pmu_pebs_disable <-intel_pmu_disable_event
          <idle>-0     [001] d.h.   610.319480: intel_pmu_enable_all <-intel_pmu_handle_irq
          <idle>-0     [001] d.h.   610.319480: intel_pmu_pebs_enable_all <-intel_pmu_enable_all
          <idle>-0     [001] d.h.   610.319480: intel_pmu_lbr_enable_all <-intel_pmu_enable_all
          <idle>-0     [001] d.h.   610.319481: _raw_spin_trylock <-___ratelimit
          <idle>-0     [001] d.h.   610.319482: _raw_spin_unlock_irqrestore <-___ratelimit
          <idle>-0     [001] d.h.   610.319482: printk <-perf_sample_event_took
          <idle>-0     [001] d.h.   610.319482: vprintk_emit <-printk

< snip ~8000 lines >

          <idle>-0     [001] d.h.   610.328841: wake_up_klogd <-console_unlock
          <idle>-0     [001] d.h.   610.328841: arch_irq_work_raise <-irq_work_queue
          <idle>-0     [001] d.h.   610.328842: apic_send_IPI_self <-arch_irq_work_raise
          <idle>-0     [001] d.h.   610.328842: native_apic_wait_icr_idle <-arch_irq_work_raise
          <idle>-0     [001] d.h.   610.328843: arch_trigger_all_cpu_backtrace_handler <-nmi_handle.isra.3

  reply	other threads:[~2013-10-16 13:31 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-14 20:35 x86, perf: throttling issues with long nmi latencies Don Zickus
2013-10-14 21:28 ` Andi Kleen
2013-10-15 10:14 ` Peter Zijlstra
2013-10-15 13:02   ` Peter Zijlstra
2013-10-15 14:32     ` Peter Zijlstra
2013-10-15 15:07       ` Peter Zijlstra
2013-10-15 15:41         ` Don Zickus
2013-10-16 10:57           ` [PATCH] perf, x86: Optimize intel_pmu_pebs_fixup_ip() Peter Zijlstra
2013-10-16 12:46             ` Don Zickus
2013-10-16 13:31               ` Peter Zijlstra [this message]
2013-10-16 13:54                 ` Don Zickus
2013-10-17 11:21                 ` Peter Zijlstra
2013-10-17 13:33                 ` Peter Zijlstra
2013-10-29 14:07                   ` [tip:perf/urgent] perf/x86: Fix NMI measurements tip-bot for Peter Zijlstra
2013-10-16 20:52             ` [PATCH] perf, x86: Optimize intel_pmu_pebs_fixup_ip() Andi Kleen
2013-10-16 21:03               ` Peter Zijlstra
2013-10-16 23:07                 ` Peter Zijlstra
2013-10-17  9:41                   ` Peter Zijlstra
2013-10-17 16:00                     ` Don Zickus
2013-10-17 16:04                       ` Don Zickus
2013-10-17 16:30                         ` Peter Zijlstra
2013-10-17 18:26                           ` Linus Torvalds
2013-10-17 21:08                             ` Peter Zijlstra
2013-10-17 21:11                               ` Peter Zijlstra
2013-10-17 22:01                             ` Peter Zijlstra
2013-10-17 22:27                               ` Linus Torvalds
2013-10-22 21:12                                 ` Peter Zijlstra
2013-10-23  7:09                                   ` Linus Torvalds
2013-10-23 20:48                                     ` Peter Zijlstra
2013-10-24 10:52                                       ` Peter Zijlstra
2013-10-24 13:47                                         ` Don Zickus
2013-10-24 14:06                                           ` Peter Zijlstra
2013-10-25 16:33                                         ` Don Zickus
2013-10-25 17:03                                           ` Peter Zijlstra
2013-10-26 10:36                                           ` Ingo Molnar
2013-10-28 13:19                                             ` Don Zickus
2013-10-29 14:08                                         ` [tip:perf/core] perf/x86: Further optimize copy_from_user_nmi() tip-bot for Peter Zijlstra
2013-10-23  7:44                                   ` [PATCH] perf, x86: Optimize intel_pmu_pebs_fixup_ip() Ingo Molnar
2013-10-17 14:49             ` Don Zickus
2013-10-17 14:51               ` Peter Zijlstra
2013-10-17 15:03                 ` Don Zickus
2013-10-17 15:09                   ` Peter Zijlstra
2013-10-17 15:11                     ` Peter Zijlstra
2013-10-17 16:50             ` [tip:perf/core] perf/x86: " tip-bot for Peter Zijlstra
2013-10-15 16:22         ` x86, perf: throttling issues with long nmi latencies Don Zickus
2013-10-15 14:36     ` Don Zickus
2013-10-15 14:39       ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131016133125.GY10651@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=acme@infradead.org \
    --cc=ak@linux.intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dzickus@redhat.com \
    --cc=eranian@google.com \
    --cc=jmario@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.