BPF Archive on lore.kernel.org
 help / color / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Brendan Gregg <brendan.d.gregg@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	Daniel Borkmann <daniel@iogearbox.net>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	"David S . Miller" <davem@davemloft.net>,
	joel@joelfernandes.org,
	"Naveen N . Rao" <naveen.n.rao@linux.ibm.com>,
	Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Subject: Re: [RFT PATCH 04/13] kprobes: Make optimizer delay to 1 second
Date: Wed, 22 Jan 2020 18:26:47 -0800
Message-ID: <20200123022647.GO2935@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20200123103334.6e1821625643d007297ecf94@kernel.org>

On Thu, Jan 23, 2020 at 10:33:34AM +0900, Masami Hiramatsu wrote:
> On Wed, 22 Jan 2020 08:54:32 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > On Wed, Jan 22, 2020 at 10:12:40PM +0900, Masami Hiramatsu wrote:
> > > On Wed, 22 Jan 2020 07:11:15 -0500
> > > Steven Rostedt <rostedt@goodmis.org> wrote:
> > > 
> > > > On Wed, 22 Jan 2020 16:23:17 +0900
> > > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > 
> > > > > On Tue, 21 Jan 2020 19:29:05 -0500
> > > > > Steven Rostedt <rostedt@goodmis.org> wrote:
> > > > > 
> > > > > > On Thu, 16 Jan 2020 23:44:52 +0900
> > > > > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > > >   
> > > > > > > Since the 5 jiffies delay for the optimizer is too
> > > > > > > short to wait for other probes, make it longer,
> > > > > > > like 1 second.  
> > > > > > 
> > > > > > Hi Masami,
> > > > > > 
> > > > > > Can you explain more *why* 5 jiffies is too short.  
> > > > > 
> > > > > Yes, I had introduced this 5 jiffies delay for multiple probe registration
> > > > > and unregistration like systemtap, which will use array-based interface to
> > > > > register/unregister. In that case, 5 jiffies will be enough for the delay
> > > > > to wait for other kprobe registration/unregsitration.
> > > > > 
> > > > > However, since perf and ftrace register/unregister probes one-by-one with
> > > > > RCU synchronization interval, the optimizer will be started before
> > > > > finishing to register/unregister all probes.
> > > > > And the optimizer locks kprobe_mutex a while -- RCU-tasks synchronization.
> > > > > Since the kprobe_mutex is also involved in disabling kprobes, this also
> > > > > stops probe-event disabling.
> > > > > 
> > > > > Maybe 5 jiffies is enough for adding/removing a few probe events, but
> > > > > not enough for dozens of probe events.
> > > > > 
> > > > 
> > > > Perhaps we should have a mechanism that can detect new probes being
> > > > added, and just continue to delay the optimization, instead of having
> > > > some arbitrary delay.
> > > 
> > > Yes, that is what [03/13] does :) 
> > > Anyway, it seems that the RCU-synchronization takes more than 5 jiffies.
> > > And in that case, [03/13] still doesn't work. That's why I added this patch
> > > after that.
> > 
> > If the RCU synchronization is synchronize_rcu_tasks(), then yes, it
> > will often take way more than 5 jiffies.  If it is synchronize_rcu(),
> > 5 jiffies would not be unusual, especially on larger systems.
> > But in the case of synchronize_rcu(), one option is to instead use
> > synchronize_rcu_expedited().  It is not clear that this last is really
> > justified in this case, but figured it might be worth mentioning.
> 
> It is synchronize_rcu(), but it seems sometimes it is called several
> times on one probe disabling.

OK, good to know.

> Anyway, without this update, I added a printk to count optimizer
> queue-length and found that all optimizer call with a single kprobe
> on the quenes. I think this depends on the machine, but as far as
> I tested on 8-threads qemu x86, shows this result...
> 
> Probes: 256 kprobe_events
> Enable events
> real	0m 0.02s
> user	0m 0.00s
> sys	0m 0.02s
> [   17.730548] Queue-update: 180, skipped, Total Queued: 180
> [   17.739445] Queue-update: 1, go, Total Queued: 180
> Disable events
> [   19.744634] Queue-update: 1, go, Total Queued: 1
> [   19.770743] Queue-update: 1, go, Total Queued: 1
> [   19.886625] Queue-update: 1, go, Total Queued: 1
> [   20.006631] Queue-update: 1, go, Total Queued: 1
> [   20.126628] Queue-update: 1, go, Total Queued: 1
> [   20.246620] Queue-update: 1, go, Total Queued: 1
> [   20.374665] Queue-update: 1, go, Total Queued: 1
> [   20.486617] Queue-update: 1, go, Total Queued: 1
> [   20.606608] Queue-update: 1, go, Total Queued: 1
> [   20.726596] Queue-update: 1, go, Total Queued: 1
> [   20.846608] Queue-update: 1, go, Total Queued: 1
> [   20.966723] Queue-update: 1, go, Total Queued: 1
> [   21.086611] Queue-update: 1, go, Total Queued: 1
> [   21.206605] Queue-update: 1, go, Total Queued: 1
> [   21.326603] Queue-update: 1, go, Total Queued: 1
> [   21.462609] Queue-update: 1, go, Total Queued: 1
> [   21.566755] Queue-update: 1, go, Total Queued: 1
> [   21.686606] Queue-update: 1, go, Total Queued: 1
> [   21.806611] Queue-update: 1, go, Total Queued: 1
> [   21.926609] Queue-update: 1, go, Total Queued: 1
> [   22.046621] Queue-update: 1, go, Total Queued: 1
> [   22.166729] Queue-update: 1, go, Total Queued: 1
> [   22.302609] Queue-update: 1, go, Total Queued: 1
> [   22.510627] Queue-update: 1, go, Total Queued: 1
> [   22.536638] Queue-update: 1, go, Total Queued: 1
> [   22.654618] Queue-update: 1, go, Total Queued: 1
> [   22.774643] Queue-update: 1, go, Total Queued: 1
> [   22.902609] Queue-update: 1, go, Total Queued: 1
> [   23.022608] Queue-update: 1, go, Total Queued: 1
> [   23.158606] Queue-update: 1, go, Total Queued: 1
> [   23.254618] Queue-update: 1, go, Total Queued: 1
> [   23.374647] Queue-update: 1, go, Total Queued: 1
> [   23.494605] Queue-update: 1, go, Total Queued: 1
> [   23.614610] Queue-update: 1, go, Total Queued: 1
> [   23.734606] Queue-update: 1, go, Total Queued: 1
> [   23.854609] Queue-update: 1, go, Total Queued: 1
> [   23.974615] Queue-update: 1, go, Total Queued: 1
> [   24.094609] Queue-update: 1, go, Total Queued: 1
> [   24.230607] Queue-update: 1, go, Total Queued: 1
> [   24.342625] Queue-update: 1, go, Total Queued: 1
> [   24.475478] Queue-update: 1, go, Total Queued: 1
> [   24.574554] Queue-update: 1, go, Total Queued: 1
> [   24.694605] Queue-update: 1, go, Total Queued: 1
> [   24.814585] Queue-update: 1, go, Total Queued: 1
> [   24.934591] Queue-update: 1, go, Total Queued: 1
> [   25.054614] Queue-update: 1, go, Total Queued: 1
> [   25.174628] Queue-update: 1, go, Total Queued: 1
> [   25.294637] Queue-update: 1, go, Total Queued: 1
> [   25.414620] Queue-update: 1, go, Total Queued: 1
> [   25.534553] Queue-update: 1, go, Total Queued: 1
> [   25.654610] Queue-update: 1, go, Total Queued: 1
> [   25.774627] Queue-update: 1, go, Total Queued: 1
> [   25.894609] Queue-update: 1, go, Total Queued: 1
> [   26.030548] Queue-update: 1, go, Total Queued: 1
> [   26.134626] Queue-update: 1, go, Total Queued: 1
> [   26.254620] Queue-update: 1, go, Total Queued: 1
> [   26.373603] Queue-update: 1, go, Total Queued: 1
> [   26.502606] Queue-update: 1, go, Total Queued: 1
> [   26.614607] Queue-update: 1, go, Total Queued: 1
> [   26.734610] Queue-update: 1, go, Total Queued: 1
> [   26.854620] Queue-update: 1, go, Total Queued: 1
> [   26.974660] Queue-update: 1, go, Total Queued: 1
> [   27.094620] Queue-update: 1, go, Total Queued: 1
> [   27.214596] Queue-update: 1, go, Total Queued: 1
> [   27.334640] Queue-update: 1, go, Total Queued: 1
> [   27.494606] Queue-update: 1, go, Total Queued: 1
> [   27.574703] Queue-update: 1, go, Total Queued: 1
> [   27.694609] Queue-update: 1, go, Total Queued: 1
> [   27.814607] Queue-update: 1, go, Total Queued: 1
> [   27.934631] Queue-update: 1, go, Total Queued: 1
> [   28.054606] Queue-update: 1, go, Total Queued: 1
> [   28.182685] Queue-update: 1, go, Total Queued: 1
> [   28.318568] Queue-update: 1, go, Total Queued: 1
> [   28.422610] Queue-update: 1, go, Total Queued: 1
> [   28.534621] Queue-update: 1, go, Total Queued: 1
> [   28.654618] Queue-update: 1, go, Total Queued: 1
> [   28.774622] Queue-update: 1, go, Total Queued: 1
> [   28.894609] Queue-update: 1, go, Total Queued: 1
> [   29.022609] Queue-update: 1, go, Total Queued: 1
> [   29.150615] Queue-update: 1, go, Total Queued: 1
> [   29.262610] Queue-update: 1, go, Total Queued: 1
> [   29.374621] Queue-update: 1, go, Total Queued: 1
> [   29.494606] Queue-update: 1, go, Total Queued: 1
> [   29.614616] Queue-update: 1, go, Total Queued: 1
> [   29.734607] Queue-update: 1, go, Total Queued: 1
> [   29.854601] Queue-update: 1, go, Total Queued: 1
> [   29.974610] Queue-update: 1, go, Total Queued: 1
> [   30.094625] Queue-update: 1, go, Total Queued: 1
> [   30.214606] Queue-update: 1, go, Total Queued: 1
> [   30.334602] Queue-update: 1, go, Total Queued: 1
> [   30.454634] Queue-update: 1, go, Total Queued: 1
> [   30.574606] Queue-update: 1, go, Total Queued: 1
> [   30.694589] Queue-update: 1, go, Total Queued: 1
> [   30.814613] Queue-update: 1, go, Total Queued: 1
> [   30.934602] Queue-update: 1, go, Total Queued: 1
> [   31.054605] Queue-update: 1, go, Total Queued: 1
> [   31.182596] Queue-update: 1, go, Total Queued: 1
> [   31.318621] Queue-update: 1, go, Total Queued: 1
> [   31.414615] Queue-update: 1, go, Total Queued: 1
> [   31.534610] Queue-update: 1, go, Total Queued: 1
> [   31.670608] Queue-update: 1, go, Total Queued: 1
> [   31.774626] Queue-update: 1, go, Total Queued: 1
> [   31.894607] Queue-update: 1, go, Total Queued: 1
> [   32.014609] Queue-update: 1, go, Total Queued: 1
> [   32.134607] Queue-update: 1, go, Total Queued: 1
> [   32.254611] Queue-update: 1, go, Total Queued: 1
> [   32.374608] Queue-update: 1, go, Total Queued: 1
> [   32.494619] Queue-update: 1, go, Total Queued: 1
> [   32.614607] Queue-update: 1, go, Total Queued: 1
> [   32.734612] Queue-update: 1, go, Total Queued: 1
> [   32.862616] Queue-update: 1, go, Total Queued: 1
> [   32.974620] Queue-update: 1, go, Total Queued: 1
> [   33.110609] Queue-update: 1, go, Total Queued: 1
> [   33.214609] Queue-update: 1, go, Total Queued: 1
> [   33.342611] Queue-update: 1, go, Total Queued: 1
> [   33.454607] Queue-update: 1, go, Total Queued: 1
> [   33.574611] Queue-update: 1, go, Total Queued: 1
> [   33.694619] Queue-update: 1, go, Total Queued: 1
> [   33.814607] Queue-update: 1, go, Total Queued: 1
> [   33.950614] Queue-update: 1, go, Total Queued: 1
> [   34.062609] Queue-update: 1, go, Total Queued: 1
> [   34.174609] Queue-update: 1, go, Total Queued: 1
> [   34.294619] Queue-update: 1, go, Total Queued: 1
> [   34.430533] Queue-update: 1, go, Total Queued: 1
> [   34.534594] Queue-update: 1, go, Total Queued: 1
> [   34.654605] Queue-update: 1, go, Total Queued: 1
> [   34.790596] Queue-update: 1, go, Total Queued: 1
> [   34.902611] Queue-update: 1, go, Total Queued: 1
> [   35.014630] Queue-update: 1, go, Total Queued: 1
> [   35.134634] Queue-update: 1, go, Total Queued: 1
> [   35.262608] Queue-update: 1, go, Total Queued: 1
> [   35.374634] Queue-update: 1, go, Total Queued: 1
> [   35.494617] Queue-update: 1, go, Total Queued: 1
> [   35.622608] Queue-update: 1, go, Total Queued: 1
> [   35.742610] Queue-update: 1, go, Total Queued: 1
> [   35.854572] Queue-update: 1, go, Total Queued: 1
> [   35.982596] Queue-update: 1, go, Total Queued: 1
> [   36.094603] Queue-update: 1, go, Total Queued: 1
> [   36.222612] Queue-update: 1, go, Total Queued: 1
> [   36.334610] Queue-update: 1, go, Total Queued: 1
> [   36.454619] Queue-update: 1, go, Total Queued: 1
> [   36.574619] Queue-update: 1, go, Total Queued: 1
> [   36.694643] Queue-update: 1, go, Total Queued: 1
> [   36.814614] Queue-update: 1, go, Total Queued: 1
> [   36.934610] Queue-update: 1, go, Total Queued: 1
> [   37.062609] Queue-update: 1, go, Total Queued: 1
> [   37.198611] Queue-update: 1, go, Total Queued: 1
> [   37.294618] Queue-update: 1, go, Total Queued: 1
> [   37.414618] Queue-update: 1, go, Total Queued: 1
> [   37.534595] Queue-update: 1, go, Total Queued: 1
> [   37.662594] Queue-update: 1, go, Total Queued: 1
> [   37.774610] Queue-update: 1, go, Total Queued: 1
> [   37.894618] Queue-update: 1, go, Total Queued: 1
> [   38.014619] Queue-update: 1, go, Total Queued: 1
> [   38.142612] Queue-update: 1, go, Total Queued: 1
> [   38.254609] Queue-update: 1, go, Total Queued: 1
> [   38.374619] Queue-update: 1, go, Total Queued: 1
> [   38.502481] Queue-update: 1, go, Total Queued: 1
> [   38.614596] Queue-update: 1, go, Total Queued: 1
> [   38.734597] Queue-update: 1, go, Total Queued: 1
> [   38.854606] Queue-update: 1, go, Total Queued: 1
> [   38.974620] Queue-update: 1, go, Total Queued: 1
> [   39.094617] Queue-update: 1, go, Total Queued: 1
> [   39.222597] Queue-update: 1, go, Total Queued: 1
> [   39.334610] Queue-update: 1, go, Total Queued: 1
> [   39.454609] Queue-update: 1, go, Total Queued: 1
> [   39.574633] Queue-update: 1, go, Total Queued: 1
> [   39.694611] Queue-update: 1, go, Total Queued: 1
> [   39.814608] Queue-update: 1, go, Total Queued: 1
> [   39.934610] Queue-update: 1, go, Total Queued: 1
> [   40.054621] Queue-update: 1, go, Total Queued: 1
> [   40.174611] Queue-update: 1, go, Total Queued: 1
> [   40.297471] Queue-update: 1, go, Total Queued: 1
> [   40.414504] Queue-update: 1, go, Total Queued: 1
> [   40.534601] Queue-update: 1, go, Total Queued: 1
> [   40.654611] Queue-update: 1, go, Total Queued: 1
> [   40.774609] Queue-update: 1, go, Total Queued: 1
> [   40.894628] Queue-update: 1, go, Total Queued: 1
> [   41.014608] Queue-update: 1, go, Total Queued: 1
> [   41.135594] Queue-update: 1, go, Total Queued: 1
> real	0m 21.40s
> user	0m 0.00s
> sys	0m 0.04s

So 21.4s/256 = 84 milliseconds per event disable, correct?

It might be worth trying synchronize_rcu_expedited() just as an experiment
to see if it speeds things up significantly.

							Thanx, Paul

> Remove events
> real	0m 2.14s
> user	0m 0.00s
> sys	0m 0.04s
> 
> 
> Thank you,
> 
> 
> -- 
> Masami Hiramatsu <mhiramat@kernel.org>

  reply index

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-16 14:44 [RFT PATCH 00/13] tracing: kprobes: Introduce async unregistration Masami Hiramatsu
2020-01-16 14:44 ` [RFT PATCH 01/13] kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex Masami Hiramatsu
2020-01-16 14:44 ` [RFT PATCH 02/13] kprobes: Remove redundant arch_disarm_kprobe() call Masami Hiramatsu
2020-01-16 14:44 ` [RFT PATCH 03/13] kprobes: Postpone optimizer until a bunch of probes (un)registered Masami Hiramatsu
2020-01-16 14:44 ` [RFT PATCH 04/13] kprobes: Make optimizer delay to 1 second Masami Hiramatsu
2020-01-22  0:29   ` Steven Rostedt
2020-01-22  7:23     ` Masami Hiramatsu
2020-01-22 12:11       ` Steven Rostedt
2020-01-22 13:12         ` Masami Hiramatsu
2020-01-22 16:54           ` Paul E. McKenney
2020-01-23  1:33             ` Masami Hiramatsu
2020-01-23  2:26               ` Paul E. McKenney [this message]
2020-01-23  6:13                 ` Masami Hiramatsu
2020-01-23 10:03                   ` Paul E. McKenney
2020-01-16 14:45 ` [RFT PATCH 05/13] tracing/kprobe: Use call_rcu to defer freeing event_file_link Masami Hiramatsu
2020-01-27 15:02   ` kbuild test robot
2020-01-27 15:02   ` [RFC PATCH] tracing/kprobe: trace_kprobe_disabled_finished can be static kbuild test robot
2020-01-28 15:02     ` Masami Hiramatsu
2020-01-16 14:45 ` [RFT PATCH 06/13] kprobes: Enable kprobe-booster with CONFIG_PREEMPT=y Masami Hiramatsu
2020-01-16 14:45 ` [RFT PATCH 07/13] kprobes: Use normal list traversal API if a mutex is held Masami Hiramatsu
2020-01-16 14:45 ` [RFT PATCH 08/13] kprobes: Use workqueue for reclaiming kprobe insn cache pages Masami Hiramatsu
2020-01-16 14:45 ` [RFT PATCH 09/13] kprobes: Free kprobe_insn_page asynchronously Masami Hiramatsu
2020-01-16 14:45 ` [RFT PATCH 10/13] kprobes: Make free_*insn_slot() mutex-less Masami Hiramatsu
2020-01-16 14:46 ` [RFT PATCH 11/13] kprobes: Add asynchronous unregistration APIs Masami Hiramatsu
2020-01-16 14:46 ` [RFT PATCH 12/13] tracing/kprobe: Free probe event asynchronously Masami Hiramatsu
2020-01-16 14:46 ` [RFT PATCH 13/13] tracing/kprobe: perf_event: Remove local kprobe " Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200123022647.GO2935@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=acme@kernel.org \
    --cc=anil.s.keshavamurthy@intel.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brendan.d.gregg@gmail.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=naveen.n.rao@linux.ibm.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git