LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
To: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>,
	Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	Sandeepa Prabhu <sandeepa.prabhu@linaro.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	x86@kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	fche@redhat.com, mingo@redhat.com, systemtap@sourceware.org,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: [PATCH -tip v9 24/26] kprobes: Enlarge hash table to 512 entries
Date: Thu, 17 Apr 2014 17:19:24 +0900
Message-ID: <20140417081924.26341.23535.stgit@ltc230.yrl.intra.hitachi.co.jp> (raw)
In-Reply-To: <20140417081636.26341.87858.stgit@ltc230.yrl.intra.hitachi.co.jp>

Currently, since the kprobes expects to be used with less than
100 probe points, its hash table just has 64 entries. This is
too little to handle several thousands of probes.
Enlarge the size of kprobe_table to 512 entries which just
consumes 4KB (on 64bit arch) for better scalability.

Note that this also decouples the hash sizes of kprobe_table and
kretprobe_inst_table/locks, since those use different sources
for hash (kprobe uses probed address, whereas kretprobe uses
task structure.)

Without this patch, enabling 17787 probes takes more than 2 hours!
(9428sec, 1 min intervals for each 2000 probes enabled)

  Enabling trace events: start at 1392782584
  0 1392782585 a2mp_chan_alloc_skb_cb_38556
  1 1392782585 a2mp_chan_close_cb_38555
  ....
  17785 1392792008 lookup_vport_34987
  17786 1392792010 loop_add_23485
  17787 1392792012 loop_attr_do_show_autoclear_23464

I profiled it and saw that more than 90% of cycles are consumed
on get_kprobe.

  Samples: 18K of event 'cycles', Event count (approx.): 37759714934
  +  95.90%  [k] get_kprobe
  +   0.76%  [k] ftrace_lookup_ip
  +   0.54%  [k] kprobe_trace_func

And also more than 60% of executed instructions were in get_kprobe
too.

  Samples: 17K of event 'instructions', Event count (approx.): 1321391290
  +  65.48%  [k] get_kprobe
  +   4.07%  [k] kprobe_trace_func
  +   2.93%  [k] optimized_callback


And annotating get_kprobe also shows the hlist is too long and takes
a time on tracking it. Thus I guess it mostly comes from tracking
too long hlist at the table entry.

       |            struct hlist_head *head;
       |            struct kprobe *p;
       |
       |            head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
       |            hlist_for_each_entry_rcu(p, head, hlist) {
 86.33 |      mov    (%rax),%rax
 11.24 |      test   %rax,%rax
       |      jne    60
       |                    if (p->addr == addr)
       |                            return p;
       |            }

To fix this issue, I tried to enlarge the size of kprobe_table
from 6(current) to 12, and profiled cycles% on the get_kprobe
with 10k probes enabled / 37k probes registered.

  Size  Cycles%
  2^6   95.58%
  2^7   85.83%
  2^8   68.43%
  2^9   48.61%
  2^10  46.95%
  2^11  48.46%
  2^12  56.95%

Here, we can see the hash tables larger than 2^9 (512 entries)
have no much performance improvement. So I decided to enlarge
it to 512 entries.

With this fix, enabling 20,000 probes just took
about 45 min (2934 sec, 1 min intervals for
each 2000 probes enabled)

  Enabling trace events: start at 1393921862
  0 1393921864 a2mp_chan_alloc_skb_cb_38581
  1 1393921864 a2mp_chan_close_cb_38580
  ...
  19997 1393924927 nfs4_match_stateid_11750
  19998 1393924927 nfs4_negotiate_security_12137
  19999 1393924928 nfs4_open_confirm_done_11785

And it reduced cycles on get_kprobe (with 20,000 probes).

  Samples: 691  of event 'cycles', Event count (approx.): 1743375714
  +  67.68%  [k] get_kprobe
  +   5.98%  [k] ftrace_lookup_ip
  +   4.03%  [k] kprobe_trace_func

Changes from v7:
 - Evaluate the size of hash table and reduce it to 512 entries.
 - Decouple the hash size of kprobe_table from others.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
---
 kernel/kprobes.c |   23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index abdede5..a29e622 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -54,9 +54,10 @@
 #include <asm/errno.h>
 #include <asm/uaccess.h>
 
-#define KPROBE_HASH_BITS 6
+#define KPROBE_HASH_BITS 9
+#define KRETPROBE_HASH_BITS 6
 #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
-
+#define KRETPROBE_TABLE_SIZE (1 << KRETPROBE_HASH_BITS)
 
 /*
  * Some oddball architectures like 64bit powerpc have function descriptors
@@ -69,7 +70,7 @@
 
 static int kprobes_initialized;
 static struct hlist_head kprobe_table[KPROBE_TABLE_SIZE];
-static struct hlist_head kretprobe_inst_table[KPROBE_TABLE_SIZE];
+static struct hlist_head kretprobe_inst_table[KRETPROBE_TABLE_SIZE];
 
 /* NOTE: change this value only with kprobe_mutex held */
 static bool kprobes_all_disarmed;
@@ -79,7 +80,7 @@ static DEFINE_MUTEX(kprobe_mutex);
 static DEFINE_PER_CPU(struct kprobe *, kprobe_instance) = NULL;
 static struct {
 	raw_spinlock_t lock ____cacheline_aligned_in_smp;
-} kretprobe_table_locks[KPROBE_TABLE_SIZE];
+} kretprobe_table_locks[KRETPROBE_TABLE_SIZE];
 
 static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
 {
@@ -1096,7 +1097,7 @@ void kretprobe_hash_lock(struct task_struct *tsk,
 			 struct hlist_head **head, unsigned long *flags)
 __acquires(hlist_lock)
 {
-	unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
+	unsigned long hash = hash_ptr(tsk, KRETPROBE_HASH_BITS);
 	raw_spinlock_t *hlist_lock;
 
 	*head = &kretprobe_inst_table[hash];
@@ -1118,7 +1119,7 @@ void kretprobe_hash_unlock(struct task_struct *tsk,
 			   unsigned long *flags)
 __releases(hlist_lock)
 {
-	unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
+	unsigned long hash = hash_ptr(tsk, KRETPROBE_HASH_BITS);
 	raw_spinlock_t *hlist_lock;
 
 	hlist_lock = kretprobe_table_lock_ptr(hash);
@@ -1153,7 +1154,7 @@ void kprobe_flush_task(struct task_struct *tk)
 		return;
 
 	INIT_HLIST_HEAD(&empty_rp);
-	hash = hash_ptr(tk, KPROBE_HASH_BITS);
+	hash = hash_ptr(tk, KRETPROBE_HASH_BITS);
 	head = &kretprobe_inst_table[hash];
 	kretprobe_table_lock(hash, &flags);
 	hlist_for_each_entry_safe(ri, tmp, head, hlist) {
@@ -1187,7 +1188,7 @@ static void cleanup_rp_inst(struct kretprobe *rp)
 	struct hlist_head *head;
 
 	/* No race here */
-	for (hash = 0; hash < KPROBE_TABLE_SIZE; hash++) {
+	for (hash = 0; hash < KRETPROBE_TABLE_SIZE; hash++) {
 		kretprobe_table_lock(hash, &flags);
 		head = &kretprobe_inst_table[hash];
 		hlist_for_each_entry_safe(ri, next, head, hlist) {
@@ -1778,7 +1779,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 	struct kretprobe_instance *ri;
 
 	/*TODO: consider to only swap the RA after the last pre_handler fired */
-	hash = hash_ptr(current, KPROBE_HASH_BITS);
+	hash = hash_ptr(current, KRETPROBE_HASH_BITS);
 	raw_spin_lock_irqsave(&rp->lock, flags);
 	if (!hlist_empty(&rp->free_instances)) {
 		ri = hlist_entry(rp->free_instances.first,
@@ -2164,8 +2165,10 @@ static int __init init_kprobes(void)
 
 	/* FIXME allocate the probe table, currently defined statically */
 	/* initialize all list heads */
-	for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
+	for (i = 0; i < KPROBE_TABLE_SIZE; i++)
 		INIT_HLIST_HEAD(&kprobe_table[i]);
+
+	for (i = 0; i < KRETPROBE_TABLE_SIZE; i++) {
 		INIT_HLIST_HEAD(&kretprobe_inst_table[i]);
 		raw_spin_lock_init(&(kretprobe_table_locks[i].lock));
 	}



  parent reply index

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-17  8:16 [PATCH -tip v9 00/26] kprobes: introduce NOKPROBE_SYMBOL, bugfixes and scalbility efforts Masami Hiramatsu
2014-04-17  8:16 ` [PATCH -tip v9 01/26] [BUGFIX]kprobes/x86: Fix page-fault handling logic Masami Hiramatsu
2014-04-17  9:58   ` [tip:perf/urgent] kprobes/x86: " tip-bot for Masami Hiramatsu
2014-04-17  8:16 ` [PATCH -tip v9 02/26] kprobes/x86: Allow to handle reentered kprobe on singlestepping Masami Hiramatsu
2014-04-24 10:57   ` [tip:perf/kprobes] kprobes/x86: Allow to handle reentered kprobe on single-stepping tip-bot for Masami Hiramatsu
2014-04-17  8:16 ` [PATCH -tip v9 03/26] kprobes: Prohibit probing on .entry.text code Masami Hiramatsu
2014-04-24 10:57   ` [tip:perf/kprobes] " tip-bot for Masami Hiramatsu
2014-04-17  8:17 ` [PATCH -tip v9 04/26] kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist Masami Hiramatsu
2014-04-24 10:58   ` [tip:perf/kprobes] kprobes: Introduce NOKPROBE_SYMBOL() macro to maintain kprobes blacklist tip-bot for Masami Hiramatsu
2014-05-01  5:26     ` kprobes broken in linux-next (was Re: [tip:perf/kprobes] kprobes: Introduce NOKPROBE_SYMBOL() macro to maintain kprobes blacklist) Vineet Gupta
2014-05-02  1:13       ` Masami Hiramatsu
2014-05-07  4:56         ` Vineet Gupta
2014-05-07 19:18       ` [tip:perf/kprobes] kprobes: Ensure blacklist data is aligned tip-bot for Vineet Gupta
2014-05-05 20:48     ` [tip:perf/kprobes] kprobes: Introduce NOKPROBE_SYMBOL() macro to maintain kprobes blacklist Tony Luck
2014-05-06  9:25       ` Masami Hiramatsu
2014-05-06 10:03       ` Masami Hiramatsu
2014-05-07 11:19         ` Masami Hiramatsu
2014-05-07 11:55           ` [RFT PATCH -next ] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64 Masami Hiramatsu
2014-05-07 11:59             ` Masami Hiramatsu
2014-05-14  8:19               ` Masami Hiramatsu
2014-05-08  4:47             ` Ananth N Mavinakayanahalli
2014-05-08  5:40               ` Masami Hiramatsu
2014-05-08  6:16                 ` Ananth N Mavinakayanahalli
2014-05-09  8:06                   ` Masami Hiramatsu
2014-05-26 11:25             ` Suzuki K. Poulose
2014-05-26 11:48               ` Masami Hiramatsu
2014-05-27  6:31               ` [RFT PATCH -next v2] " Masami Hiramatsu
2014-05-29 19:13                 ` Suzuki K. Poulose
2014-05-30  2:47                   ` Masami Hiramatsu
2014-05-30  3:18                     ` [RFT PATCH -next v3] " Masami Hiramatsu
2014-06-06  6:38                       ` Masami Hiramatsu
2014-06-17 23:03                         ` Tony Luck
2014-06-18  7:56                         ` Michael Ellerman
2014-06-18  8:46                           ` Masami Hiramatsu
2014-06-19  1:30                             ` Michael Ellerman
2014-06-19  4:52                               ` Masami Hiramatsu
2014-06-19  6:40                                 ` Suzuki K. Poulose
2014-06-19  7:26                                   ` Masami Hiramatsu
2014-06-19  9:45                                     ` Suzuki K. Poulose
2014-06-19 11:01                                       ` Masami Hiramatsu
2014-06-19 11:20                                         ` Masami Hiramatsu
2014-06-20  0:37                                           ` Michael Ellerman
2014-06-20  2:13                                             ` Masami Hiramatsu
2014-04-17  8:17 ` [PATCH -tip v9 05/26] [BUGFIX] kprobes/x86: Prohibit probing on debug_stack_* Masami Hiramatsu
2014-04-24 10:58   ` [tip:perf/kprobes] kprobes, x86: Prohibit probing on debug_stack_*() tip-bot for Masami Hiramatsu
2014-04-17  8:17 ` [PATCH -tip v9 06/26] [BUGFIX] x86: Prohibit probing on native_set_debugreg/load_idt Masami Hiramatsu
2014-04-24 10:58   ` [tip:perf/kprobes] kprobes, x86: Prohibit probing on native_set_debugreg()/load_idt() tip-bot for Masami Hiramatsu
2014-04-17  8:17 ` [PATCH -tip v9 07/26] [BUGFIX] x86: Prohibit probing on thunk functions and restore Masami Hiramatsu
2014-04-24 10:58   ` [tip:perf/kprobes] kprobes, " tip-bot for Masami Hiramatsu
2014-04-17  8:17 ` [PATCH -tip v9 08/26] kprobes/x86: Call exception handlers directly from do_int3/do_debug Masami Hiramatsu
2014-04-24 10:59   ` [tip:perf/kprobes] " tip-bot for Masami Hiramatsu
2014-04-24 11:26     ` Jiri Kosina
2014-04-17  8:17 ` [PATCH -tip v9 09/26] x86: Call exception_enter after kprobes handled Masami Hiramatsu
2014-04-24 10:59   ` [tip:perf/kprobes] kprobes, " tip-bot for Masami Hiramatsu
2014-06-13 17:14     ` Frederic Weisbecker
2014-06-14  5:44       ` Masami Hiramatsu
2014-06-14  6:47         ` [PATCH -tip ] [Bugfix] x86/kprobes: Fix build errors and blacklist context_track_user Masami Hiramatsu
2014-06-14  8:58           ` [tip:perf/urgent] " tip-bot for Masami Hiramatsu
2014-06-16 15:52           ` [PATCH -tip ] [Bugfix] " Frederic Weisbecker
2014-04-17  8:17 ` [PATCH -tip v9 10/26] kprobes/x86: Allow probe on some kprobe preparation functions Masami Hiramatsu
2014-04-24 10:59   ` [tip:perf/kprobes] " tip-bot for Masami Hiramatsu
2014-04-17  8:17 ` [PATCH -tip v9 11/26] kprobes: Allow probe on some kprobe functions Masami Hiramatsu
2014-04-24 10:59   ` [tip:perf/kprobes] " tip-bot for Masami Hiramatsu
2014-04-17  8:18 ` [PATCH -tip v9 12/26] ftrace/*probes: Allow probing on some functions Masami Hiramatsu
2014-04-24 10:59   ` [tip:perf/kprobes] kprobes, ftrace: " tip-bot for Masami Hiramatsu
2014-04-17  8:18 ` [PATCH -tip v9 13/26] x86: Allow kprobes on text_poke/hw_breakpoint Masami Hiramatsu
2014-04-24 11:00   ` [tip:perf/kprobes] kprobes, x86: Allow kprobes on text_poke/ hw_breakpoint tip-bot for Masami Hiramatsu
2014-04-24 11:26     ` Jiri Kosina
2014-04-17  8:18 ` [PATCH -tip v9 14/26] x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation Masami Hiramatsu
2014-04-24 11:00   ` [tip:perf/kprobes] kprobes, " tip-bot for Masami Hiramatsu
2014-04-17  8:18 ` [PATCH -tip v9 15/26] kprobes: Use NOKPROBE_SYMBOL macro instead of __kprobes Masami Hiramatsu
2014-04-24 11:00   ` [tip:perf/kprobes] " tip-bot for Masami Hiramatsu
2014-04-17  8:18 ` [PATCH -tip v9 16/26] ftrace/kprobes: Use NOKPROBE_SYMBOL macro in ftrace Masami Hiramatsu
2014-04-24 11:00   ` [tip:perf/kprobes] kprobes, ftrace: " tip-bot for Masami Hiramatsu
2014-04-17  8:18 ` [PATCH -tip v9 17/26] notifier: Use NOKPROBE_SYMBOL macro in notifier Masami Hiramatsu
2014-04-17 14:40   ` Josh Triplett
2014-04-24 11:00   ` [tip:perf/kprobes] kprobes, " tip-bot for Masami Hiramatsu
2014-04-17  8:18 ` [PATCH -tip v9 18/26] sched: Use NOKPROBE_SYMBOL macro in sched Masami Hiramatsu
2014-04-24 11:01   ` [tip:perf/kprobes] kprobes, " tip-bot for Masami Hiramatsu
2014-04-17  8:18 ` [PATCH -tip v9 19/26] kprobes: Show blacklist entries via debugfs Masami Hiramatsu
2014-04-24 11:01   ` [tip:perf/kprobes] " tip-bot for Masami Hiramatsu
2014-04-17  8:18 ` [PATCH -tip v9 20/26] kprobes: Support blacklist functions in module Masami Hiramatsu
2014-04-24  8:56   ` Ingo Molnar
2014-04-24 11:24     ` Masami Hiramatsu
2014-04-25  8:19       ` Ingo Molnar
2014-04-25 10:12         ` Masami Hiramatsu
2014-04-25 10:55           ` Masami Hiramatsu
2014-04-17  8:19 ` [PATCH -tip v9 21/26] kprobes: Use NOKPROBE_SYMBOL() in sample modules Masami Hiramatsu
2014-04-17  8:19 ` [PATCH -tip v9 22/26] kprobes/x86: Use kprobe_blacklist for .kprobes.text and .entry.text Masami Hiramatsu
2014-04-24  8:58   ` Ingo Molnar
2014-04-24 11:22     ` Masami Hiramatsu
2014-04-17  8:19 ` [PATCH -tip v9 23/26] kprobes/x86: Remove unneeded preempt_disable/enable in interrupt handlers Masami Hiramatsu
2014-04-17  8:19 ` Masami Hiramatsu [this message]
2014-04-17  8:19 ` [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits Masami Hiramatsu
2014-04-24  9:01   ` Ingo Molnar
2014-04-24 11:38     ` Masami Hiramatsu
2014-04-25  8:20       ` Ingo Molnar
2014-04-25  9:43         ` Masami Hiramatsu
2014-04-26  7:12           ` Ingo Molnar
2014-04-27 12:49             ` Masami Hiramatsu
2014-04-17  8:19 ` [PATCH -tip v9 26/26] ftrace: Introduce FTRACE_OPS_FL_SELF_FILTER for ftrace-kprobe Masami Hiramatsu
2014-04-17  8:37 ` [PATCH -tip v9 00/26] kprobes: introduce NOKPROBE_SYMBOL, bugfixes and scalbility efforts Ingo Molnar
2014-04-17  8:53   ` Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140417081924.26341.23535.stgit@ltc230.yrl.intra.hitachi.co.jp \
    --to=masami.hiramatsu.pt@hitachi.com \
    --cc=ananth@in.ibm.com \
    --cc=andi@firstfloor.org \
    --cc=fche@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=sandeepa.prabhu@linaro.org \
    --cc=systemtap@sourceware.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git