linux-modules.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols
@ 2022-09-08 13:09 Zhen Lei
  2022-09-08 13:09 ` [PATCH 1/7] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Zhen Lei @ 2022-09-08 13:09 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This is very slow.

In fact, we can first compress the name being looked up and then use
it for comparison when traversing 'kallsyms_names'.

This patch series optimizes the performance of function kallsyms_lookup_name(),
and function klp_find_object_symbol() in the livepatch module. Based on the
test results, the performance overhead is reduced to 5%. That is, the
performance of these functions is improved by 20 times.

To avoid increasing the kernel size in non-debug mode, the optimization is only
for the case CONFIG_KALLSYMS_ALL=y.


Zhen Lei (7):
  scripts/kallsyms: don't compress symbol type when
    CONFIG_KALLSYMS_ALL=y
  scripts/kallsyms: rename build_initial_tok_table()
  kallsyms: Adjust the types of some local variables
  kallsyms: Improve the performance of kallsyms_lookup_name()
  kallsyms: Add helper kallsyms_on_each_match_symbol()
  livepatch: Use kallsyms_on_each_match_symbol() to improve performance
  livepatch: Improve the search performance of
    module_kallsyms_on_each_symbol()

 include/linux/kallsyms.h |   8 +++
 kernel/kallsyms.c        | 135 +++++++++++++++++++++++++++++++++++++--
 kernel/livepatch/core.c  |  25 ++++++--
 kernel/module/kallsyms.c |  13 +++-
 scripts/kallsyms.c       |  19 ++++--
 5 files changed, 184 insertions(+), 16 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/7] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y
  2022-09-08 13:09 [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
@ 2022-09-08 13:09 ` Zhen Lei
  2022-09-08 13:09 ` [PATCH 2/7] scripts/kallsyms: rename build_initial_tok_table() Zhen Lei
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Zhen Lei @ 2022-09-08 13:09 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This is very slow.

In fact, we can first compress the name being looked up and then use
it for comparison when traversing 'kallsyms_names'.

This increases the size of 'kallsyms_names'. About 48KiB, 2.67%, on x86
with defconfig.
Before: kallsyms_num_syms=131392, sizeof(kallsyms_names)=1823659
After : kallsyms_num_syms=131392, sizeof(kallsyms_names)=1872418

However, if CONFIG_KALLSYMS_ALL is not set, the size of 'kallsyms_names'
does not change.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 scripts/kallsyms.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index f18e6dfc68c5839..ab6fe7cd014efd1 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -60,6 +60,7 @@ static unsigned int table_size, table_cnt;
 static int all_symbols;
 static int absolute_percpu;
 static int base_relative;
+static int sym_start_idx;
 
 static int token_profit[0x10000];
 
@@ -511,7 +512,7 @@ static void learn_symbol(const unsigned char *symbol, int len)
 {
 	int i;
 
-	for (i = 0; i < len - 1; i++)
+	for (i = sym_start_idx; i < len - 1; i++)
 		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]++;
 }
 
@@ -520,7 +521,7 @@ static void forget_symbol(const unsigned char *symbol, int len)
 {
 	int i;
 
-	for (i = 0; i < len - 1; i++)
+	for (i = sym_start_idx; i < len - 1; i++)
 		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]--;
 }
 
@@ -538,7 +539,7 @@ static unsigned char *find_token(unsigned char *str, int len,
 {
 	int i;
 
-	for (i = 0; i < len - 1; i++) {
+	for (i = sym_start_idx; i < len - 1; i++) {
 		if (str[i] == token[0] && str[i+1] == token[1])
 			return &str[i];
 	}
@@ -780,6 +781,14 @@ int main(int argc, char **argv)
 	} else if (argc != 1)
 		usage();
 
+	/*
+	 * Skip the symbol type, do not compress it to optimize the performance
+	 * of finding or traversing symbols in kernel, this is good for modules
+	 * such as livepatch.
+	 */
+	if (all_symbols)
+		sym_start_idx = 1;
+
 	read_map(stdin);
 	shrink_table();
 	if (absolute_percpu)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/7] scripts/kallsyms: rename build_initial_tok_table()
  2022-09-08 13:09 [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
  2022-09-08 13:09 ` [PATCH 1/7] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
@ 2022-09-08 13:09 ` Zhen Lei
  2022-09-08 13:09 ` [PATCH 3/7] kallsyms: Adjust the types of some local variables Zhen Lei
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Zhen Lei @ 2022-09-08 13:09 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Except for the function build_initial_tok_table(), no token abbreviation
is used elsewhere.

$ cat scripts/kallsyms.c | grep tok | wc -l
33
$ cat scripts/kallsyms.c | grep token | wc -l
31

Here, it would be clearer to use the full name.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 scripts/kallsyms.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index ab6fe7cd014efd1..678ebe7d4c1cc38 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -526,7 +526,7 @@ static void forget_symbol(const unsigned char *symbol, int len)
 }
 
 /* do the initial token count */
-static void build_initial_tok_table(void)
+static void build_initial_token_table(void)
 {
 	unsigned int i;
 
@@ -651,7 +651,7 @@ static void insert_real_symbols_in_table(void)
 
 static void optimize_token_table(void)
 {
-	build_initial_tok_table();
+	build_initial_token_table();
 
 	insert_real_symbols_in_table();
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/7] kallsyms: Adjust the types of some local variables
  2022-09-08 13:09 [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
  2022-09-08 13:09 ` [PATCH 1/7] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
  2022-09-08 13:09 ` [PATCH 2/7] scripts/kallsyms: rename build_initial_tok_table() Zhen Lei
@ 2022-09-08 13:09 ` Zhen Lei
  2022-09-08 13:09 ` [PATCH 4/7] kallsyms: Improve the performance of kallsyms_lookup_name() Zhen Lei
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Zhen Lei @ 2022-09-08 13:09 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

The type of kallsyms_num_syms is 'unsigned int', adjust the type of
local variables associated with it for indexing, so that their types
are consistent.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/kallsyms.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 3e7e2c2ad2f75ef..9dd4774b6c6edf6 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -190,8 +190,7 @@ static bool cleanup_symbol_name(char *s)
 unsigned long kallsyms_lookup_name(const char *name)
 {
 	char namebuf[KSYM_NAME_LEN];
-	unsigned long i;
-	unsigned int off;
+	unsigned int i, off;
 
 	/* Skip the search for empty string. */
 	if (!*name)
@@ -218,8 +217,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 			    void *data)
 {
 	char namebuf[KSYM_NAME_LEN];
-	unsigned long i;
-	unsigned int off;
+	unsigned int i, off;
 	int ret;
 
 	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
@@ -237,7 +235,7 @@ static unsigned long get_symbol_pos(unsigned long addr,
 				    unsigned long *offset)
 {
 	unsigned long symbol_start = 0, symbol_end = 0;
-	unsigned long i, low, high, mid;
+	unsigned int i, low, high, mid;
 
 	/* This kernel should never had been booted. */
 	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/7] kallsyms: Improve the performance of kallsyms_lookup_name()
  2022-09-08 13:09 [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (2 preceding siblings ...)
  2022-09-08 13:09 ` [PATCH 3/7] kallsyms: Adjust the types of some local variables Zhen Lei
@ 2022-09-08 13:09 ` Zhen Lei
  2022-09-08 13:09 ` [PATCH 5/7] kallsyms: Add helper kallsyms_on_each_match_symbol() Zhen Lei
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Zhen Lei @ 2022-09-08 13:09 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This is very slow.

In fact, we can first compress the name being looked up and then use
it for comparison when traversing 'kallsyms_names'.

This requires CONFIG_KALLSYMS_ALL=y.

The pseudo code of the test case is as follows:
static int stat_find_name(...)
{
	start = sched_clock();
	(void)kallsyms_lookup_name(name);
	end = sched_clock();
	//Update min, max, cnt, sum
}

/*
 * Traverse all symbols in sequence and collect statistics on the time
 * taken by kallsyms_lookup_name() to lookup each symbol.
 */
kallsyms_on_each_symbol(stat_find_name, NULL);

The test results are as follows (twice):
After : min=7106, max=  564822, cnt=131392, avg= 247965
After : min=6971, max=  557676, cnt=131393, avg= 248350
Before: min= 682, max=23045734, cnt=131392, avg=6966802
Before: min= 647, max=17676731, cnt=131392, avg=6965314

The average time consumed is only 3.56% and the maximum time consumed is
only 2.76% of the time consumed before optimization.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/kallsyms.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 9dd4774b6c6edf6..e1cd7305aa5f548 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -87,6 +87,72 @@ static unsigned int kallsyms_expand_symbol(unsigned int off,
 	return off;
 }
 
+static int kallsyms_name_to_tokens(const char *name, char *buf)
+{
+	int i, j, k, n;
+	int len, token_len;
+	const char *token;
+	unsigned char token_idx[KSYM_NAME_LEN];
+	unsigned char token_bak[KSYM_NAME_LEN];
+
+	if (!IS_ENABLED(CONFIG_KALLSYMS_ALL))
+		return 0;
+
+	/*
+	 * n, number of tokens in the string name.
+	 * token_idx[i], the start index of the ith token.
+	 * token_idx[n] is used to calculate the length of the last token.
+	 */
+	n = strlen(name);
+	if (n >= KSYM_NAME_LEN)
+		return 0;
+	for (i = 0; i <= n; i++)
+		token_idx[i] = (unsigned char)i;
+
+	/*
+	 * For tokens whose token_len >= 2, a larger index value indicates
+	 * a higher occurrence frequency. See scripts/kallsyms.c
+	 */
+	for (i = 255; i >= 0; i--) {
+		token = &kallsyms_token_table[kallsyms_token_index[i]];
+		token_len = strlen(token);
+		if (token_len <= 1)
+			continue;
+
+		/*
+		 * Find and merge two tokens into one.
+		 *
+		 *                |<-- new_token -->|
+		 *                | token1 | token2 |
+		 * token_idx[]:   j       j+1      j+2
+		 *
+		 */
+		for (j = 0; j < n - 1; j++) {
+			len = token_idx[j + 2] - token_idx[j];
+			if (len == token_len &&
+			    !strncmp(name + token_idx[j], token, len)) {
+				token_bak[token_idx[j]] = (unsigned char)i;
+				for (k = j + 1; k < n; k++)
+					token_idx[k] = token_idx[k + 1];
+				n--;
+			}
+		}
+	}
+
+	for (j = 0; j < n; j++) {
+		len = token_idx[j + 1] - token_idx[j];
+		if (len <= 1) {
+			buf[j] = name[token_idx[j]];
+			continue;
+		}
+
+		buf[j] = token_bak[token_idx[j]];
+	}
+	buf[n] = 0;
+
+	return n;
+}
+
 /*
  * Get symbol type information. This is encoded as a single char at the
  * beginning of the symbol name.
@@ -191,11 +257,29 @@ unsigned long kallsyms_lookup_name(const char *name)
 {
 	char namebuf[KSYM_NAME_LEN];
 	unsigned int i, off;
+	int len;
 
 	/* Skip the search for empty string. */
 	if (!*name)
 		return 0;
 
+	len = kallsyms_name_to_tokens(name, namebuf);
+	if (!len)
+		goto slow_path;
+
+	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
+		if (kallsyms_names[off] == len + 1 &&
+		    !memcmp(&kallsyms_names[off + 2], namebuf, len)) {
+			return kallsyms_sym_address(i);
+		}
+
+		off += kallsyms_names[off] + 1;
+	}
+
+	if (!IS_ENABLED(CONFIG_LTO_CLANG))
+		goto module_lookup;
+
+slow_path:
 	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
 		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
 
@@ -205,6 +289,8 @@ unsigned long kallsyms_lookup_name(const char *name)
 		if (cleanup_symbol_name(namebuf) && strcmp(namebuf, name) == 0)
 			return kallsyms_sym_address(i);
 	}
+
+module_lookup:
 	return module_kallsyms_lookup_name(name);
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/7] kallsyms: Add helper kallsyms_on_each_match_symbol()
  2022-09-08 13:09 [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (3 preceding siblings ...)
  2022-09-08 13:09 ` [PATCH 4/7] kallsyms: Improve the performance of kallsyms_lookup_name() Zhen Lei
@ 2022-09-08 13:09 ` Zhen Lei
  2022-09-08 13:09 ` [PATCH 6/7] livepatch: Use kallsyms_on_each_match_symbol() to improve performance Zhen Lei
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Zhen Lei @ 2022-09-08 13:09 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Function kallsyms_on_each_symbol() traverses all symbols and submits each
symbol to the hook 'fn' for judgment and processing. For some cases, the
hook actually only handles the matched symbol, such as livepatch.

So that, we can first compress the name being looked up and then use
it for comparison when traversing 'kallsyms_names', this greatly reduces
the time consumed by traversing.

This requires CONFIG_KALLSYMS_ALL=y, so that scripts/kallsyms does not
compress that type character of each symbol.

If CONFIG_KALLSYMS_ALL=n, the traversal of symbols is rolled back to the
mode before optimization.

The pseudo code of the test case is as follows:
static int tst_find(void *data, const char *name,
		    struct module *mod, unsigned long addr)
{
	if (strcmp(name, "vmap") == 0)
		*(unsigned long *)data = addr;
        return 0;
}

static int tst_match(void *data, unsigned long addr)
{
        *(unsigned long *)data = addr;
        return 0;
}

start = sched_clock();
kallsyms_on_each_match_symbol(tst_match, "vmap", &addr);
end = sched_clock();

start = sched_clock();
kallsyms_on_each_symbol(tst_find, &addr);
end = sched_clock();

The test results are as follows (twice):
kallsyms_on_each_match_symbol:  1058511,  1079288
kallsyms_on_each_symbol      : 26097313, 24765180

kallsyms_on_each_match_symbol() consumes only 4.2% of
kallsyms_on_each_symbol()'s time.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 include/linux/kallsyms.h |  8 ++++++++
 kernel/kallsyms.c        | 41 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index ad39636e0c3f122..f9f2cc084cab16b 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -69,6 +69,8 @@ static inline void *dereference_symbol_descriptor(void *ptr)
 int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 				      unsigned long),
 			    void *data);
+extern int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
+					 const char *name, void *data);
 
 /* Lookup the address for a symbol. Returns 0 if not found. */
 unsigned long kallsyms_lookup_name(const char *name);
@@ -168,6 +170,12 @@ static inline int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct
 {
 	return -EOPNOTSUPP;
 }
+
+static inline int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
+						const char *name, void *data)
+{
+	return -EOPNOTSUPP;
+}
 #endif /*CONFIG_KALLSYMS*/
 
 static inline void print_ip_sym(const char *loglvl, unsigned long ip)
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index e1cd7305aa5f548..9816a0ac30c8c48 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -316,6 +316,47 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 	return 0;
 }
 
+int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
+				  const char *name, void *data)
+{
+	unsigned int i, off;
+	int len, ret;
+	char namebuf[KSYM_NAME_LEN];
+
+	len = kallsyms_name_to_tokens(name, namebuf);
+	if (!len)
+		goto slow_path;
+
+	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
+		if ((i & 0xfff) == 0)
+			cond_resched();
+
+		if ((kallsyms_names[off] == len + 1) &&
+		    !memcmp(&kallsyms_names[off + 2], namebuf, len)) {
+			ret = fn(data, kallsyms_sym_address(i));
+			if (ret != 0)
+				return ret;
+			cond_resched();
+		}
+		off += kallsyms_names[off] + 1;
+	}
+
+	return 0;
+
+slow_path:
+	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
+		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
+		if (!strcmp(name, namebuf)) {
+			ret = fn(data, kallsyms_sym_address(i));
+			if (ret != 0)
+				return ret;
+		}
+		cond_resched();
+	}
+
+	return 0;
+}
+
 static unsigned long get_symbol_pos(unsigned long addr,
 				    unsigned long *symbolsize,
 				    unsigned long *offset)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/7] livepatch: Use kallsyms_on_each_match_symbol() to improve performance
  2022-09-08 13:09 [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (4 preceding siblings ...)
  2022-09-08 13:09 ` [PATCH 5/7] kallsyms: Add helper kallsyms_on_each_match_symbol() Zhen Lei
@ 2022-09-08 13:09 ` Zhen Lei
  2022-09-08 13:09 ` [PATCH 7/7] livepatch: Improve the search performance of module_kallsyms_on_each_symbol() Zhen Lei
  2022-09-09  0:07 ` [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Luis Chamberlain
  7 siblings, 0 replies; 10+ messages in thread
From: Zhen Lei @ 2022-09-08 13:09 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Based on the test results of kallsyms_on_each_match_symbol() and
kallsyms_on_each_symbol(), this can reduce the performance overhead by
approximately 95%, if CONFIG_KALLSYMS_ALL=y.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/livepatch/core.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 42f7e716d56bf72..31b57ccf908017e 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -153,6 +153,24 @@ static int klp_find_callback(void *data, const char *name,
 	return 0;
 }
 
+static int klp_match_callback(void *data, unsigned long addr)
+{
+	struct klp_find_arg *args = data;
+
+	args->addr = addr;
+	args->count++;
+
+	/*
+	 * Finish the search when the symbol is found for the desired position
+	 * or the position is not defined for a non-unique symbol.
+	 */
+	if ((args->pos && (args->count == args->pos)) ||
+	    (!args->pos && (args->count > 1)))
+		return 1;
+
+	return 0;
+}
+
 static int klp_find_object_symbol(const char *objname, const char *name,
 				  unsigned long sympos, unsigned long *addr)
 {
@@ -167,7 +185,7 @@ static int klp_find_object_symbol(const char *objname, const char *name,
 	if (objname)
 		module_kallsyms_on_each_symbol(klp_find_callback, &args);
 	else
-		kallsyms_on_each_symbol(klp_find_callback, &args);
+		kallsyms_on_each_match_symbol(klp_match_callback, name, &args);
 
 	/*
 	 * Ensure an address was found. If sympos is 0, ensure symbol is unique;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 7/7] livepatch: Improve the search performance of module_kallsyms_on_each_symbol()
  2022-09-08 13:09 [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (5 preceding siblings ...)
  2022-09-08 13:09 ` [PATCH 6/7] livepatch: Use kallsyms_on_each_match_symbol() to improve performance Zhen Lei
@ 2022-09-08 13:09 ` Zhen Lei
  2022-09-09  0:07 ` [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Luis Chamberlain
  7 siblings, 0 replies; 10+ messages in thread
From: Zhen Lei @ 2022-09-08 13:09 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Currently we traverse all symbols of all modules to find the specified
function for the specified module. But in reality, we just need to find
the given module and then traverse all the symbols in it.

In order to achieve this purpose, split the call to hook 'fn' into two
phases:
1. Finds the given module. Pass pointer 'mod'. Hook 'fn' directly returns
   the comparison result of the module name without comparing the function
   name.
2. Finds the given function in that module. Pass pointer 'mod = NULL'.
   Hook 'fn' skip the comparison of module name and directly compare
   function names.

Phase1: mod1-->mod2..(subsequent modules do not need to be compared)
                |
Phase2:          -->f1-->f2-->f3

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/livepatch/core.c  |  7 ++-----
 kernel/module/kallsyms.c | 13 ++++++++++++-
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 31b57ccf908017e..98e23137e4133bc 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -130,15 +130,12 @@ static int klp_find_callback(void *data, const char *name,
 {
 	struct klp_find_arg *args = data;
 
-	if ((mod && !args->objname) || (!mod && args->objname))
-		return 0;
+	if (mod)
+		return strcmp(args->objname, mod->name);
 
 	if (strcmp(args->name, name))
 		return 0;
 
-	if (args->objname && strcmp(args->objname, mod->name))
-		return 0;
-
 	args->addr = addr;
 	args->count++;
 
diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
index f5c5c9175333df7..b033613e6c7e3bb 100644
--- a/kernel/module/kallsyms.c
+++ b/kernel/module/kallsyms.c
@@ -510,6 +510,11 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
 		if (mod->state == MODULE_STATE_UNFORMED)
 			continue;
 
+		/* check mod->name first */
+		ret = fn(data, NULL, mod, 0);
+		if (ret)
+			continue;
+
 		/* Use rcu_dereference_sched() to remain compliant with the sparse tool */
 		preempt_disable();
 		kallsyms = rcu_dereference_sched(mod->kallsyms);
@@ -522,10 +527,16 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
 				continue;
 
 			ret = fn(data, kallsyms_symbol_name(kallsyms, i),
-				 mod, kallsyms_symbol_value(sym));
+				 NULL, kallsyms_symbol_value(sym));
 			if (ret != 0)
 				goto out;
 		}
+
+		/*
+		 * The given module is found, the subsequent modules do not
+		 * need to be compared.
+		 */
+		break;
 	}
 out:
 	mutex_unlock(&module_mutex);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols
  2022-09-08 13:09 [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (6 preceding siblings ...)
  2022-09-08 13:09 ` [PATCH 7/7] livepatch: Improve the search performance of module_kallsyms_on_each_symbol() Zhen Lei
@ 2022-09-09  0:07 ` Luis Chamberlain
  2022-09-09  1:17   ` Leizhen (ThunderTown)
  7 siblings, 1 reply; 10+ messages in thread
From: Luis Chamberlain @ 2022-09-09  0:07 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	linux-modules

On Thu, Sep 08, 2022 at 09:09:29PM +0800, Zhen Lei wrote:
> Currently, to search for a symbol, we need to expand the symbols in
> 'kallsyms_names' one by one, and then use the expanded string for
> comparison. This is very slow.
> 
> In fact, we can first compress the name being looked up and then use
> it for comparison when traversing 'kallsyms_names'.
> 
> This patch series optimizes the performance of function kallsyms_lookup_name(),
> and function klp_find_object_symbol() in the livepatch module. Based on the
> test results, the performance overhead is reduced to 5%. That is, the
> performance of these functions is improved by 20 times.
> 
> To avoid increasing the kernel size in non-debug mode, the optimization is only
> for the case CONFIG_KALLSYMS_ALL=y.

WIthout having time yet to reveiw the implementation details, it would
seem this is an area we may want to test for future improvements easily,
so a selftest better yet a kunit test may be nice for this. Can you
write one so we can easily gather a simple metric for "how long does
this take"?

  Luis

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols
  2022-09-09  0:07 ` [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Luis Chamberlain
@ 2022-09-09  1:17   ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 10+ messages in thread
From: Leizhen (ThunderTown) @ 2022-09-09  1:17 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	linux-modules



On 2022/9/9 8:07, Luis Chamberlain wrote:
> On Thu, Sep 08, 2022 at 09:09:29PM +0800, Zhen Lei wrote:
>> Currently, to search for a symbol, we need to expand the symbols in
>> 'kallsyms_names' one by one, and then use the expanded string for
>> comparison. This is very slow.
>>
>> In fact, we can first compress the name being looked up and then use
>> it for comparison when traversing 'kallsyms_names'.
>>
>> This patch series optimizes the performance of function kallsyms_lookup_name(),
>> and function klp_find_object_symbol() in the livepatch module. Based on the
>> test results, the performance overhead is reduced to 5%. That is, the
>> performance of these functions is improved by 20 times.
>>
>> To avoid increasing the kernel size in non-debug mode, the optimization is only
>> for the case CONFIG_KALLSYMS_ALL=y.
> 
> WIthout having time yet to reveiw the implementation details, it would
> seem this is an area we may want to test for future improvements easily,
> so a selftest better yet a kunit test may be nice for this. Can you
> write one so we can easily gather a simple metric for "how long does
> this take"?

Good advice. I'll write it today.

> 
>   Luis
> .
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-09-09  1:17 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-08 13:09 [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
2022-09-08 13:09 ` [PATCH 1/7] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
2022-09-08 13:09 ` [PATCH 2/7] scripts/kallsyms: rename build_initial_tok_table() Zhen Lei
2022-09-08 13:09 ` [PATCH 3/7] kallsyms: Adjust the types of some local variables Zhen Lei
2022-09-08 13:09 ` [PATCH 4/7] kallsyms: Improve the performance of kallsyms_lookup_name() Zhen Lei
2022-09-08 13:09 ` [PATCH 5/7] kallsyms: Add helper kallsyms_on_each_match_symbol() Zhen Lei
2022-09-08 13:09 ` [PATCH 6/7] livepatch: Use kallsyms_on_each_match_symbol() to improve performance Zhen Lei
2022-09-08 13:09 ` [PATCH 7/7] livepatch: Improve the search performance of module_kallsyms_on_each_symbol() Zhen Lei
2022-09-09  0:07 ` [PATCH 0/7] kallsyms: Optimizes the performance of lookup symbols Luis Chamberlain
2022-09-09  1:17   ` Leizhen (ThunderTown)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).