linux-modules.vger.kernel.org archive mirror
* [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols
@ 2022-09-09 13:00 Zhen Lei
  2022-09-09 13:00 ` [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Zhen Lei @ 2022-09-09 13:00 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

v1 --> v2:
Add self-test facility

v1:
Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This is very slow.

In fact, we can first compress the name being looked up and then use
it for comparison when traversing 'kallsyms_names'.
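
The core idea, as a simplified sketch (the real implementation is in patch 4
and also handles the CONFIG_KALLSYMS_ALL=n and LTO fallback paths):

/* compress the target name once, using the same token table */
len = kallsyms_name_to_tokens(name, namebuf);

/* then compare compressed bytes instead of expanding every entry */
for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
	if (kallsyms_names[off] == len + 1 &&
	    !memcmp(&kallsyms_names[off + 2], namebuf, len))
		return kallsyms_sym_address(i);
	off += kallsyms_names[off] + 1;
}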

This patch series optimizes the performance of function kallsyms_lookup_name(),
and function klp_find_object_symbol() in the livepatch module. Based on the
test results, the performance overhead is reduced to 5%. That is, the
performance of these functions is improved by 20 times.

To avoid increasing the kernel size in non-debug mode, the optimization is only
for the case CONFIG_KALLSYMS_ALL=y.

Zhen Lei (8):
  scripts/kallsyms: don't compress symbol type when
    CONFIG_KALLSYMS_ALL=y
  scripts/kallsyms: rename build_initial_tok_table()
  kallsyms: Adjust the types of some local variables
  kallsyms: Improve the performance of kallsyms_lookup_name()
  kallsyms: Add helper kallsyms_on_each_match_symbol()
  livepatch: Use kallsyms_on_each_match_symbol() to improve performance
  livepatch: Improve the search performance of
    module_kallsyms_on_each_symbol()
  kallsyms: Add self-test facility

 include/linux/kallsyms.h   |   8 ++
 init/Kconfig               |  13 ++
 kernel/Makefile            |   1 +
 kernel/kallsyms.c          | 135 ++++++++++++++++++++-
 kernel/kallsyms_selftest.c | 243 +++++++++++++++++++++++++++++++++++++
 kernel/livepatch/core.c    |  25 +++-
 kernel/module/kallsyms.c   |  13 +-
 scripts/kallsyms.c         |  19 ++-
 8 files changed, 441 insertions(+), 16 deletions(-)
 create mode 100644 kernel/kallsyms_selftest.c

-- 
2.25.1



* [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y
  2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
@ 2022-09-09 13:00 ` Zhen Lei
  2022-09-20 17:26   ` Petr Mladek
  2022-09-09 13:00 ` [PATCH v2 2/8] scripts/kallsyms: rename build_initial_tok_table() Zhen Lei
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Zhen Lei @ 2022-09-09 13:00 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This is very slow.

In fact, we can first compress the name being looked up and then use
it for comparison when traversing 'kallsyms_names'.

This increases the size of 'kallsyms_names' by about 48 KiB (2.67%) on x86
with defconfig.
Before: kallsyms_num_syms=131392, sizeof(kallsyms_names)=1823659
After : kallsyms_num_syms=131392, sizeof(kallsyms_names)=1872418

However, if CONFIG_KALLSYMS_ALL is not set, the size of 'kallsyms_names'
does not change.
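
For illustration, one 'kallsyms_names' entry then looks roughly like this
(an annotated sketch inferred from the lookup code added later in this
series, with a made-up symbol):

/*
 * Entry for a hypothetical symbol "vmap" of type 'T':
 *
 *   [len]['T'][compressed name tokens ...]
 *     |    |
 *     |    +--- symbol type kept as a literal byte, never merged into
 *     |         a two-byte token (learn_symbol/find_token skip index 0)
 *     +-------- number of bytes that follow (type byte + name tokens)
 *
 * Keeping the type byte literal lets the kernel compare a pre-compressed
 * name against &kallsyms_names[off + 2] without expanding the entry.
 */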

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 scripts/kallsyms.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index f18e6dfc68c5839..ab6fe7cd014efd1 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -60,6 +60,7 @@ static unsigned int table_size, table_cnt;
 static int all_symbols;
 static int absolute_percpu;
 static int base_relative;
+static int sym_start_idx;
 
 static int token_profit[0x10000];
 
@@ -511,7 +512,7 @@ static void learn_symbol(const unsigned char *symbol, int len)
 {
 	int i;
 
-	for (i = 0; i < len - 1; i++)
+	for (i = sym_start_idx; i < len - 1; i++)
 		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]++;
 }
 
@@ -520,7 +521,7 @@ static void forget_symbol(const unsigned char *symbol, int len)
 {
 	int i;
 
-	for (i = 0; i < len - 1; i++)
+	for (i = sym_start_idx; i < len - 1; i++)
 		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]--;
 }
 
@@ -538,7 +539,7 @@ static unsigned char *find_token(unsigned char *str, int len,
 {
 	int i;
 
-	for (i = 0; i < len - 1; i++) {
+	for (i = sym_start_idx; i < len - 1; i++) {
 		if (str[i] == token[0] && str[i+1] == token[1])
 			return &str[i];
 	}
@@ -780,6 +781,14 @@ int main(int argc, char **argv)
 	} else if (argc != 1)
 		usage();
 
+	/*
+	 * Skip the symbol type, do not compress it to optimize the performance
+	 * of finding or traversing symbols in kernel, this is good for modules
+	 * such as livepatch.
+	 */
+	if (all_symbols)
+		sym_start_idx = 1;
+
 	read_map(stdin);
 	shrink_table();
 	if (absolute_percpu)
-- 
2.25.1



* [PATCH v2 2/8] scripts/kallsyms: rename build_initial_tok_table()
  2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
  2022-09-09 13:00 ` [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
@ 2022-09-09 13:00 ` Zhen Lei
  2022-09-09 13:00 ` [PATCH v2 3/8] kallsyms: Adjust the types of some local variables Zhen Lei
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Zhen Lei @ 2022-09-09 13:00 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Except for the function build_initial_tok_table(), the abbreviation 'tok'
is not used anywhere else; the full word 'token' is used.

$ cat scripts/kallsyms.c | grep tok | wc -l
33
$ cat scripts/kallsyms.c | grep token | wc -l
31

Here, it would be clearer to use the full name.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 scripts/kallsyms.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index ab6fe7cd014efd1..678ebe7d4c1cc38 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -526,7 +526,7 @@ static void forget_symbol(const unsigned char *symbol, int len)
 }
 
 /* do the initial token count */
-static void build_initial_tok_table(void)
+static void build_initial_token_table(void)
 {
 	unsigned int i;
 
@@ -651,7 +651,7 @@ static void insert_real_symbols_in_table(void)
 
 static void optimize_token_table(void)
 {
-	build_initial_tok_table();
+	build_initial_token_table();
 
 	insert_real_symbols_in_table();
 
-- 
2.25.1



* [PATCH v2 3/8] kallsyms: Adjust the types of some local variables
  2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
  2022-09-09 13:00 ` [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
  2022-09-09 13:00 ` [PATCH v2 2/8] scripts/kallsyms: rename build_initial_tok_table() Zhen Lei
@ 2022-09-09 13:00 ` Zhen Lei
  2022-09-09 13:00 ` [PATCH v2 4/8] kallsyms: Improve the performance of kallsyms_lookup_name() Zhen Lei
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Zhen Lei @ 2022-09-09 13:00 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

The type of kallsyms_num_syms is 'unsigned int'. Adjust the types of the
local variables that are used to index symbols and are compared against
it, so that the types are consistent.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/kallsyms.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 3e7e2c2ad2f75ef..9dd4774b6c6edf6 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -190,8 +190,7 @@ static bool cleanup_symbol_name(char *s)
 unsigned long kallsyms_lookup_name(const char *name)
 {
 	char namebuf[KSYM_NAME_LEN];
-	unsigned long i;
-	unsigned int off;
+	unsigned int i, off;
 
 	/* Skip the search for empty string. */
 	if (!*name)
@@ -218,8 +217,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 			    void *data)
 {
 	char namebuf[KSYM_NAME_LEN];
-	unsigned long i;
-	unsigned int off;
+	unsigned int i, off;
 	int ret;
 
 	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
@@ -237,7 +235,7 @@ static unsigned long get_symbol_pos(unsigned long addr,
 				    unsigned long *offset)
 {
 	unsigned long symbol_start = 0, symbol_end = 0;
-	unsigned long i, low, high, mid;
+	unsigned int i, low, high, mid;
 
 	/* This kernel should never had been booted. */
 	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
-- 
2.25.1



* [PATCH v2 4/8] kallsyms: Improve the performance of kallsyms_lookup_name()
  2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (2 preceding siblings ...)
  2022-09-09 13:00 ` [PATCH v2 3/8] kallsyms: Adjust the types of some local variables Zhen Lei
@ 2022-09-09 13:00 ` Zhen Lei
  2022-09-09 13:00 ` [PATCH v2 5/8] kallsyms: Add helper kallsyms_on_each_match_symbol() Zhen Lei
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Zhen Lei @ 2022-09-09 13:00 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This is very slow.

In fact, we can first compress the name being looked up and then use
it for comparison when traversing 'kallsyms_names'.

This requires CONFIG_KALLSYMS_ALL=y.

The pseudo code of the test case is as follows:
static int stat_find_name(...)
{
	start = sched_clock();
	(void)kallsyms_lookup_name(name);
	end = sched_clock();
	//Update min, max, cnt, sum
}

/*
 * Traverse all symbols in sequence and collect statistics on the time
 * taken by kallsyms_lookup_name() to lookup each symbol.
 */
kallsyms_on_each_symbol(stat_find_name, NULL);

The test results are as follows (two runs):
After : min=7106, max=  564822, cnt=131392, avg= 247965
After : min=6971, max=  557676, cnt=131393, avg= 248350
Before: min= 682, max=23045734, cnt=131392, avg=6966802
Before: min= 647, max=17676731, cnt=131392, avg=6965314

The average time consumed is now only 3.56%, and the maximum time consumed
only 2.76%, of what it was before the optimization.
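
For illustration, the new helper kallsyms_name_to_tokens() converts the
looked-up name into the same token-index byte sequence that
scripts/kallsyms emits. A hypothetical example (the real token table is
generated at build time, so the indices below are made up):

  Assume kallsyms_token_table maps index 0xf0 to "vm" and 0xf1 to "ap".
  Then kallsyms_name_to_tokens("vmap", buf) returns len == 2 with
  buf[] == { 0xf0, 0xf1 }, so each 'kallsyms_names' entry can be checked
  with one length compare plus a 2-byte memcmp(), without expanding it.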

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/kallsyms.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 9dd4774b6c6edf6..e1cd7305aa5f548 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -87,6 +87,72 @@ static unsigned int kallsyms_expand_symbol(unsigned int off,
 	return off;
 }
 
+static int kallsyms_name_to_tokens(const char *name, char *buf)
+{
+	int i, j, k, n;
+	int len, token_len;
+	const char *token;
+	unsigned char token_idx[KSYM_NAME_LEN];
+	unsigned char token_bak[KSYM_NAME_LEN];
+
+	if (!IS_ENABLED(CONFIG_KALLSYMS_ALL))
+		return 0;
+
+	/*
+	 * n, number of tokens in the string name.
+	 * token_idx[i], the start index of the ith token.
+	 * token_idx[n] is used to calculate the length of the last token.
+	 */
+	n = strlen(name);
+	if (n >= KSYM_NAME_LEN)
+		return 0;
+	for (i = 0; i <= n; i++)
+		token_idx[i] = (unsigned char)i;
+
+	/*
+	 * For tokens whose token_len >= 2, a larger index value indicates
+	 * a higher occurrence frequency. See scripts/kallsyms.c
+	 */
+	for (i = 255; i >= 0; i--) {
+		token = &kallsyms_token_table[kallsyms_token_index[i]];
+		token_len = strlen(token);
+		if (token_len <= 1)
+			continue;
+
+		/*
+		 * Find and merge two tokens into one.
+		 *
+		 *                |<-- new_token -->|
+		 *                | token1 | token2 |
+		 * token_idx[]:   j       j+1      j+2
+		 *
+		 */
+		for (j = 0; j < n - 1; j++) {
+			len = token_idx[j + 2] - token_idx[j];
+			if (len == token_len &&
+			    !strncmp(name + token_idx[j], token, len)) {
+				token_bak[token_idx[j]] = (unsigned char)i;
+				for (k = j + 1; k < n; k++)
+					token_idx[k] = token_idx[k + 1];
+				n--;
+			}
+		}
+	}
+
+	for (j = 0; j < n; j++) {
+		len = token_idx[j + 1] - token_idx[j];
+		if (len <= 1) {
+			buf[j] = name[token_idx[j]];
+			continue;
+		}
+
+		buf[j] = token_bak[token_idx[j]];
+	}
+	buf[n] = 0;
+
+	return n;
+}
+
 /*
  * Get symbol type information. This is encoded as a single char at the
  * beginning of the symbol name.
@@ -191,11 +257,29 @@ unsigned long kallsyms_lookup_name(const char *name)
 {
 	char namebuf[KSYM_NAME_LEN];
 	unsigned int i, off;
+	int len;
 
 	/* Skip the search for empty string. */
 	if (!*name)
 		return 0;
 
+	len = kallsyms_name_to_tokens(name, namebuf);
+	if (!len)
+		goto slow_path;
+
+	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
+		if (kallsyms_names[off] == len + 1 &&
+		    !memcmp(&kallsyms_names[off + 2], namebuf, len)) {
+			return kallsyms_sym_address(i);
+		}
+
+		off += kallsyms_names[off] + 1;
+	}
+
+	if (!IS_ENABLED(CONFIG_LTO_CLANG))
+		goto module_lookup;
+
+slow_path:
 	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
 		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
 
@@ -205,6 +289,8 @@ unsigned long kallsyms_lookup_name(const char *name)
 		if (cleanup_symbol_name(namebuf) && strcmp(namebuf, name) == 0)
 			return kallsyms_sym_address(i);
 	}
+
+module_lookup:
 	return module_kallsyms_lookup_name(name);
 }
 
-- 
2.25.1



* [PATCH v2 5/8] kallsyms: Add helper kallsyms_on_each_match_symbol()
  2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (3 preceding siblings ...)
  2022-09-09 13:00 ` [PATCH v2 4/8] kallsyms: Improve the performance of kallsyms_lookup_name() Zhen Lei
@ 2022-09-09 13:00 ` Zhen Lei
  2022-09-09 13:00 ` [PATCH v2 6/8] livepatch: Use kallsyms_on_each_match_symbol() to improve performance Zhen Lei
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Zhen Lei @ 2022-09-09 13:00 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Function kallsyms_on_each_symbol() traverses all symbols and submits each
one to the hook 'fn' for judgment and processing. In some cases, such as
livepatch, the hook actually handles only the matched symbol.

Therefore, we can first compress the name being looked up and then use it
for comparison when traversing 'kallsyms_names'. This greatly reduces the
traversal time.

This requires CONFIG_KALLSYMS_ALL=y, so that scripts/kallsyms does not
compress the type character of each symbol.

If CONFIG_KALLSYMS_ALL=n, the symbol traversal falls back to the
unoptimized path.

The pseudo code of the test case is as follows:
static int tst_find(void *data, const char *name,
		    struct module *mod, unsigned long addr)
{
	if (strcmp(name, "vmap") == 0)
		*(unsigned long *)data = addr;
	return 0;
}

static int tst_match(void *data, unsigned long addr)
{
	*(unsigned long *)data = addr;
	return 0;
}

start = sched_clock();
kallsyms_on_each_match_symbol(tst_match, "vmap", &addr);
end = sched_clock();

start = sched_clock();
kallsyms_on_each_symbol(tst_find, &addr);
end = sched_clock();

The test results are as follows (two runs):
kallsyms_on_each_match_symbol:  1058511,  1079288
kallsyms_on_each_symbol      : 26097313, 24765180

kallsyms_on_each_match_symbol() consumes only 4.2% of
kallsyms_on_each_symbol()'s time.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 include/linux/kallsyms.h |  8 ++++++++
 kernel/kallsyms.c        | 41 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index ad39636e0c3f122..f9f2cc084cab16b 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -69,6 +69,8 @@ static inline void *dereference_symbol_descriptor(void *ptr)
 int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 				      unsigned long),
 			    void *data);
+extern int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
+					 const char *name, void *data);
 
 /* Lookup the address for a symbol. Returns 0 if not found. */
 unsigned long kallsyms_lookup_name(const char *name);
@@ -168,6 +170,12 @@ static inline int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct
 {
 	return -EOPNOTSUPP;
 }
+
+static inline int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
+						const char *name, void *data)
+{
+	return -EOPNOTSUPP;
+}
 #endif /*CONFIG_KALLSYMS*/
 
 static inline void print_ip_sym(const char *loglvl, unsigned long ip)
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index e1cd7305aa5f548..9816a0ac30c8c48 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -316,6 +316,47 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 	return 0;
 }
 
+int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
+				  const char *name, void *data)
+{
+	unsigned int i, off;
+	int len, ret;
+	char namebuf[KSYM_NAME_LEN];
+
+	len = kallsyms_name_to_tokens(name, namebuf);
+	if (!len)
+		goto slow_path;
+
+	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
+		if ((i & 0xfff) == 0)
+			cond_resched();
+
+		if ((kallsyms_names[off] == len + 1) &&
+		    !memcmp(&kallsyms_names[off + 2], namebuf, len)) {
+			ret = fn(data, kallsyms_sym_address(i));
+			if (ret != 0)
+				return ret;
+			cond_resched();
+		}
+		off += kallsyms_names[off] + 1;
+	}
+
+	return 0;
+
+slow_path:
+	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
+		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
+		if (!strcmp(name, namebuf)) {
+			ret = fn(data, kallsyms_sym_address(i));
+			if (ret != 0)
+				return ret;
+		}
+		cond_resched();
+	}
+
+	return 0;
+}
+
 static unsigned long get_symbol_pos(unsigned long addr,
 				    unsigned long *symbolsize,
 				    unsigned long *offset)
-- 
2.25.1



* [PATCH v2 6/8] livepatch: Use kallsyms_on_each_match_symbol() to improve performance
  2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (4 preceding siblings ...)
  2022-09-09 13:00 ` [PATCH v2 5/8] kallsyms: Add helper kallsyms_on_each_match_symbol() Zhen Lei
@ 2022-09-09 13:00 ` Zhen Lei
  2022-09-09 13:00 ` [PATCH v2 7/8] livepatch: Improve the search performance of module_kallsyms_on_each_symbol() Zhen Lei
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Zhen Lei @ 2022-09-09 13:00 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Based on the test results for kallsyms_on_each_match_symbol() and
kallsyms_on_each_symbol(), this reduces the symbol lookup time by
approximately 95% when CONFIG_KALLSYMS_ALL=y.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/livepatch/core.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 42f7e716d56bf72..31b57ccf908017e 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -153,6 +153,24 @@ static int klp_find_callback(void *data, const char *name,
 	return 0;
 }
 
+static int klp_match_callback(void *data, unsigned long addr)
+{
+	struct klp_find_arg *args = data;
+
+	args->addr = addr;
+	args->count++;
+
+	/*
+	 * Finish the search when the symbol is found for the desired position
+	 * or the position is not defined for a non-unique symbol.
+	 */
+	if ((args->pos && (args->count == args->pos)) ||
+	    (!args->pos && (args->count > 1)))
+		return 1;
+
+	return 0;
+}
+
 static int klp_find_object_symbol(const char *objname, const char *name,
 				  unsigned long sympos, unsigned long *addr)
 {
@@ -167,7 +185,7 @@ static int klp_find_object_symbol(const char *objname, const char *name,
 	if (objname)
 		module_kallsyms_on_each_symbol(klp_find_callback, &args);
 	else
-		kallsyms_on_each_symbol(klp_find_callback, &args);
+		kallsyms_on_each_match_symbol(klp_match_callback, name, &args);
 
 	/*
 	 * Ensure an address was found. If sympos is 0, ensure symbol is unique;
-- 
2.25.1



* [PATCH v2 7/8] livepatch: Improve the search performance of module_kallsyms_on_each_symbol()
  2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (5 preceding siblings ...)
  2022-09-09 13:00 ` [PATCH v2 6/8] livepatch: Use kallsyms_on_each_match_symbol() to improve performance Zhen Lei
@ 2022-09-09 13:00 ` Zhen Lei
  2022-09-20 12:08   ` Petr Mladek
  2022-09-09 13:00 ` [PATCH v2 8/8] kallsyms: Add self-test facility Zhen Lei
  2022-09-16  1:27 ` [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Leizhen (ThunderTown)
  8 siblings, 1 reply; 20+ messages in thread
From: Zhen Lei @ 2022-09-09 13:00 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Currently we traverse all symbols of all modules to find the specified
function for the specified module. But in reality, we just need to find
the given module and then traverse all the symbols in it.

In order to achieve this, the call to hook 'fn' is split into two phases:
1. Find the given module. Pointer 'mod' is passed. Hook 'fn' directly
   returns the result of comparing the module name, without comparing
   function names.
2. Find the given function in that module. 'mod = NULL' is passed.
   Hook 'fn' skips the module name comparison and directly compares
   function names.

Phase1: mod1-->mod2..(subsequent modules do not need to be compared)
                |
Phase2:          -->f1-->f2-->f3
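
A callback written against this two-phase contract would look roughly like
the following (hypothetical sketch; the actual livepatch conversion is in
the hunk below):

static int example_callback(void *data, const char *name,
			    struct module *mod, unsigned long addr)
{
	struct example_arg *args = data;	/* hypothetical private data */

	/* Phase 1: only 'mod' is valid; non-zero means "not my module". */
	if (mod)
		return strcmp(args->objname, mod->name);

	/* Phase 2: 'mod' is NULL; compare the symbol name. */
	if (strcmp(args->name, name))
		return 0;

	args->addr = addr;
	return 1;	/* non-zero stops the per-module symbol walk */
}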

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/livepatch/core.c  |  7 ++-----
 kernel/module/kallsyms.c | 13 ++++++++++++-
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 31b57ccf908017e..98e23137e4133bc 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -130,15 +130,12 @@ static int klp_find_callback(void *data, const char *name,
 {
 	struct klp_find_arg *args = data;
 
-	if ((mod && !args->objname) || (!mod && args->objname))
-		return 0;
+	if (mod)
+		return strcmp(args->objname, mod->name);
 
 	if (strcmp(args->name, name))
 		return 0;
 
-	if (args->objname && strcmp(args->objname, mod->name))
-		return 0;
-
 	args->addr = addr;
 	args->count++;
 
diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
index f5c5c9175333df7..b033613e6c7e3bb 100644
--- a/kernel/module/kallsyms.c
+++ b/kernel/module/kallsyms.c
@@ -510,6 +510,11 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
 		if (mod->state == MODULE_STATE_UNFORMED)
 			continue;
 
+		/* check mod->name first */
+		ret = fn(data, NULL, mod, 0);
+		if (ret)
+			continue;
+
 		/* Use rcu_dereference_sched() to remain compliant with the sparse tool */
 		preempt_disable();
 		kallsyms = rcu_dereference_sched(mod->kallsyms);
@@ -522,10 +527,16 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
 				continue;
 
 			ret = fn(data, kallsyms_symbol_name(kallsyms, i),
-				 mod, kallsyms_symbol_value(sym));
+				 NULL, kallsyms_symbol_value(sym));
 			if (ret != 0)
 				goto out;
 		}
+
+		/*
+		 * The given module has been found, so the remaining modules
+		 * do not need to be compared.
+		 */
+		break;
 	}
 out:
 	mutex_unlock(&module_mutex);
-- 
2.25.1



* [PATCH v2 8/8] kallsyms: Add self-test facility
  2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (6 preceding siblings ...)
  2022-09-09 13:00 ` [PATCH v2 7/8] livepatch: Improve the search performance of module_kallsyms_on_each_symbol() Zhen Lei
@ 2022-09-09 13:00 ` Zhen Lei
  2022-09-17  8:07   ` Kees Cook
  2022-09-16  1:27 ` [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Leizhen (ThunderTown)
  8 siblings, 1 reply; 20+ messages in thread
From: Zhen Lei @ 2022-09-09 13:00 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules
  Cc: Zhen Lei

Add some test cases to test the function and performance of some kallsyms
interfaces, such as kallsyms_lookup_name. It also calculates the
compression rate of the kallsyms compression algorithm for the current
symbol set.

Start self-test automatically after system startup.

Example of output content: (prefix 'kallsyms_selftest:' is omitted)
start
There are 174101 symbols in total:
 --------------------------------------------------------------
|           |  compressed size  |  original size  |  ratio(%)  |
|--------------------------------------------------------------|
|  no  '\0' |        1785569    |       3750649   |   47.60    |
| with '\0' |        1959670    |       3924750   |   49.93    |
 --------------------------------------------------------------

kallsyms_lookup_name() looked up 174101 symbols
The time spent on each symbol is (ns): min=5350, max=985150, avg=295517
kallsyms_on_each_symbol() lookup vmap: 15806120 ns
kallsyms_on_each_symbol() traverse vmap: 15817140 ns
kallsyms_on_each_match_symbol() lookup vmap: 32840 ns
kallsyms_on_each_match_symbol() traverse vmap: 567370 ns

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 init/Kconfig               |  13 ++
 kernel/Makefile            |   1 +
 kernel/kallsyms_selftest.c | 243 +++++++++++++++++++++++++++++++++++++
 3 files changed, 257 insertions(+)
 create mode 100644 kernel/kallsyms_selftest.c

diff --git a/init/Kconfig b/init/Kconfig
index 532362fcfe31fd3..2fcace3b9f063bf 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1716,6 +1716,19 @@ config KALLSYMS
 	  symbolic stack backtraces. This increases the size of the kernel
 	  somewhat, as all symbols have to be loaded into the kernel image.
 
+config KALLSYMS_SELFTEST
+	bool "Test the function and performance of some interfaces in kallsyms"
+	depends on KALLSYMS
+	default n
+	help
+	  Test the functionality and performance of some kallsyms interfaces,
+	  such as kallsyms_lookup_name. It also calculates the compression
+	  ratio of the kallsyms compression algorithm for the current symbol set.
+
+	  The self-test runs automatically after system startup. Run
+	  "dmesg | grep kallsyms_selftest" to collect the test results; "finish"
+	  on the last line indicates that the test is complete.
+
 config KALLSYMS_ALL
 	bool "Include all symbols in kallsyms"
 	depends on DEBUG_KERNEL && KALLSYMS
diff --git a/kernel/Makefile b/kernel/Makefile
index 318789c728d3290..122a5fed457bd98 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -68,6 +68,7 @@ endif
 obj-$(CONFIG_UID16) += uid16.o
 obj-$(CONFIG_MODULE_SIG_FORMAT) += module_signature.o
 obj-$(CONFIG_KALLSYMS) += kallsyms.o
+obj-$(CONFIG_KALLSYMS_SELFTEST) += kallsyms_selftest.o
 obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
 obj-$(CONFIG_CRASH_CORE) += crash_core.o
 obj-$(CONFIG_KEXEC_CORE) += kexec_core.o
diff --git a/kernel/kallsyms_selftest.c b/kernel/kallsyms_selftest.c
new file mode 100644
index 000000000000000..d89d6b2f5dd763e
--- /dev/null
+++ b/kernel/kallsyms_selftest.c
@@ -0,0 +1,243 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Test the function and performance of kallsyms
+ *
+ * Copyright (C) Huawei Technologies Co., Ltd., 2022
+ *
+ * Authors: Zhen Lei <thunder.leizhen@huawei.com> Huawei
+ */
+
+#define pr_fmt(fmt) "kallsyms_selftest: " fmt
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/kallsyms.h>
+#include <linux/kthread.h>
+#include <linux/sched/clock.h>
+#include <linux/vmalloc.h>
+
+#include "kallsyms_internal.h"
+
+
+struct test_stat {
+	int min;
+	int max;
+	int cnt;
+	u64 sum;
+	unsigned long addr;
+};
+
+
+static int match_symbol(void *data, unsigned long addr)
+{
+	struct test_stat *stat = (struct test_stat *)data;
+
+	stat->cnt++;
+	stat->addr = addr;
+
+	if (stat->cnt == stat->max)
+		return 1;
+
+	return 0;
+}
+
+static void test_kallsyms_on_each_match_symbol(void)
+{
+	u64 t0, t1;
+	unsigned long flags;
+	struct test_stat stat;
+
+	stat.cnt = 0;
+	stat.max = 1;
+	stat.addr = 0;
+	local_irq_save(flags);
+	t0 = sched_clock();
+	kallsyms_on_each_match_symbol(match_symbol, "vmap", &stat);
+	t1 = sched_clock();
+	local_irq_restore(flags);
+	if (stat.addr != (unsigned long)vmap || stat.cnt != 1) {
+		pr_info("kallsyms_on_each_match_symbol() test failed\n");
+		return;
+	}
+	pr_info("kallsyms_on_each_match_symbol() lookup vmap: %lld ns\n", t1 - t0);
+
+	stat.cnt = 0;
+	stat.max = INT_MAX;
+	stat.addr = 0;
+	local_irq_save(flags);
+	t0 = sched_clock();
+	kallsyms_on_each_match_symbol(match_symbol, "vmap", &stat);
+	t1 = sched_clock();
+	local_irq_restore(flags);
+	if (stat.addr != (unsigned long)vmap || stat.cnt != 1) {
+		pr_info("kallsyms_on_each_match_symbol() test failed\n");
+		return;
+	}
+	pr_info("kallsyms_on_each_match_symbol() traverse vmap: %lld ns\n", t1 - t0);
+}
+
+static int find_symbol(void *data, const char *name,
+		       struct module *mod, unsigned long addr)
+{
+	struct test_stat *stat = (struct test_stat *)data;
+
+	if (strcmp(name, "vmap") == 0) {
+		stat->cnt++;
+		stat->addr = addr;
+	}
+
+	if (stat->cnt == stat->max)
+		return 1;
+
+	return 0;
+}
+
+static void test_kallsyms_on_each_symbol(void)
+{
+	u64 t0, t1;
+	unsigned long flags;
+	struct test_stat stat;
+
+	stat.cnt = 0;
+	stat.max = 1;
+	stat.addr = 0;
+	local_irq_save(flags);
+	t0 = sched_clock();
+	kallsyms_on_each_symbol(find_symbol, &stat);
+	t1 = sched_clock();
+	local_irq_restore(flags);
+	if (stat.addr != (unsigned long)vmap || stat.cnt != 1) {
+		pr_info("kallsyms_on_each_symbol() test failed\n");
+		return;
+	}
+	pr_info("kallsyms_on_each_symbol() lookup vmap: %lld ns\n", t1 - t0);
+
+	stat.cnt = 0;
+	stat.max = INT_MAX;
+	stat.addr = 0;
+	local_irq_save(flags);
+	t0 = sched_clock();
+	kallsyms_on_each_symbol(find_symbol, &stat);
+	t1 = sched_clock();
+	local_irq_restore(flags);
+	if (stat.addr != (unsigned long)vmap || stat.cnt != 1) {
+		pr_info("kallsyms_on_each_symbol() test failed\n");
+		return;
+	}
+	pr_info("kallsyms_on_each_symbol() traverse vmap: %lld ns\n", t1 - t0);
+}
+
+static int lookup_name(void *data, const char *name, struct module *mod, unsigned long addr)
+{
+	u64 t0, t1, t;
+	unsigned long flags;
+	struct test_stat *stat = (struct test_stat *)data;
+
+	local_irq_save(flags);
+	t0 = sched_clock();
+	(void)kallsyms_lookup_name(name);
+	t1 = sched_clock();
+	local_irq_restore(flags);
+
+	t = t1 - t0;
+	if (t < stat->min)
+		stat->min = t;
+
+	if (t > stat->max)
+		stat->max = t;
+
+	stat->cnt++;
+	stat->sum += t;
+
+	return 0;
+}
+
+static void test_kallsyms_lookup_name(void)
+{
+	struct test_stat stat;
+
+	stat.min = INT_MAX;
+	stat.max = 0;
+	stat.cnt = 0;
+	stat.sum = 0;
+	kallsyms_on_each_symbol(lookup_name, &stat);
+	pr_info("kallsyms_lookup_name() looked up %d symbols\n", stat.cnt);
+	pr_info("The time spent on each symbol is (ns): min=%d, max=%d, avg=%lld\n",
+		stat.min, stat.max, stat.sum / stat.cnt);
+
+	stat.addr = kallsyms_lookup_name("vmap");
+	if (stat.addr != (unsigned long)vmap)
+		pr_info("kallsyms_lookup_name() test failed\n");
+}
+
+static int stat_symbol_len(void *data, const char *name,
+			   struct module *mod, unsigned long addr)
+{
+	*(u32 *)data += strlen(name);
+
+	return 0;
+}
+
+static void test_kallsyms_compression_ratio(void)
+{
+	int i;
+	const u8 *name;
+	u32 pos;
+	u32 ratio, total_size, total_len = 0;
+
+	kallsyms_on_each_symbol(stat_symbol_len, &total_len);
+
+	pos = kallsyms_num_syms - 1;
+	name = &kallsyms_names[kallsyms_markers[pos >> 8]];
+	for (i = 0; i <= (pos & 0xff); i++)
+		name = name + (*name) + 1;
+
+	/* The length and string terminator are not counted */
+	total_size = (name - kallsyms_names) - (kallsyms_num_syms * 2);
+	pr_info("There are %d symbols in total:\n", kallsyms_num_syms);
+	pr_info(" --------------------------------------------------------------\n");
+	pr_info("|           |  compressed size  |  original size  |  ratio(%%)  |\n");
+	pr_info("|--------------------------------------------------------------|\n");
+	ratio = 10000ULL * total_size / total_len;
+	pr_info("|  no  '\\0' |     %10d    |    %10d   |   %2d.%-2d    |\n",
+			total_size, total_len, ratio / 100, ratio % 100);
+	total_size += kallsyms_num_syms;
+	total_len  += kallsyms_num_syms;
+	ratio = 10000ULL * total_size / total_len;
+	pr_info("| with '\\0' |     %10d    |    %10d   |   %2d.%-2d    |\n",
+			total_size, total_len, ratio / 100, ratio % 100);
+	pr_info(" --------------------------------------------------------------\n");
+	pr_info("\n");
+}
+
+static int test_entry(void *p)
+{
+	do {
+		schedule_timeout(5 * HZ);
+	} while (system_state != SYSTEM_RUNNING);
+
+	pr_info("start\n");
+	test_kallsyms_compression_ratio();
+	test_kallsyms_lookup_name();
+	test_kallsyms_on_each_symbol();
+	test_kallsyms_on_each_match_symbol();
+	pr_info("finish\n");
+
+	return 0;
+}
+
+static int __init kallsyms_test_init(void)
+{
+	struct task_struct *t;
+
+	t = kthread_create(test_entry, NULL, "kallsyms_test");
+	if (IS_ERR(t)) {
+		pr_info("Create kallsyms selftest task failed\n");
+		return PTR_ERR(t);
+	}
+	kthread_bind(t, 0);
+	wake_up_process(t);
+
+	return 0;
+}
+late_initcall(kallsyms_test_init);
-- 
2.25.1



* Re: [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols
  2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
                   ` (7 preceding siblings ...)
  2022-09-09 13:00 ` [PATCH v2 8/8] kallsyms: Add self-test facility Zhen Lei
@ 2022-09-16  1:27 ` Leizhen (ThunderTown)
  2022-09-16  3:17   ` Leizhen (ThunderTown)
  8 siblings, 1 reply; 20+ messages in thread
From: Leizhen (ThunderTown) @ 2022-09-16  1:27 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules

Hi, everybody:
  Can anyone review it? Maybe I should split this patch series
into two parts: kallsyms and livepatch.
  In fact, the performance can be improved even if the compression policy
of the symbol type is not changed, that is, the scripts/kallsyms.c file is
not modified, but we perform the length-based filtering first. That way, it'll
be easier for everyone to review. OK, I'm ready for v3.


On 2022/9/9 21:00, Zhen Lei wrote:
> v1 --> v2:
> Add self-test facility
> 
> v1:
> Currently, to search for a symbol, we need to expand the symbols in
> 'kallsyms_names' one by one, and then use the expanded string for
> comparison. This is very slow.
> 
> In fact, we can first compress the name being looked up and then use
> it for comparison when traversing 'kallsyms_names'.
> 
> This patch series optimizes the performance of function kallsyms_lookup_name(),
> and function klp_find_object_symbol() in the livepatch module. Based on the
> test results, the performance overhead is reduced to 5%. That is, the
> performance of these functions is improved by 20 times.
> 
> To avoid increasing the kernel size in non-debug mode, the optimization is only
> for the case CONFIG_KALLSYMS_ALL=y.
> 
> Zhen Lei (8):
>   scripts/kallsyms: don't compress symbol type when
>     CONFIG_KALLSYMS_ALL=y
>   scripts/kallsyms: rename build_initial_tok_table()
>   kallsyms: Adjust the types of some local variables
>   kallsyms: Improve the performance of kallsyms_lookup_name()
>   kallsyms: Add helper kallsyms_on_each_match_symbol()
>   livepatch: Use kallsyms_on_each_match_symbol() to improve performance
>   livepatch: Improve the search performance of
>     module_kallsyms_on_each_symbol()
>   kallsyms: Add self-test facility
> 
>  include/linux/kallsyms.h   |   8 ++
>  init/Kconfig               |  13 ++
>  kernel/Makefile            |   1 +
>  kernel/kallsyms.c          | 135 ++++++++++++++++++++-
>  kernel/kallsyms_selftest.c | 243 +++++++++++++++++++++++++++++++++++++
>  kernel/livepatch/core.c    |  25 +++-
>  kernel/module/kallsyms.c   |  13 +-
>  scripts/kallsyms.c         |  19 ++-
>  8 files changed, 441 insertions(+), 16 deletions(-)
>  create mode 100644 kernel/kallsyms_selftest.c
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols
  2022-09-16  1:27 ` [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Leizhen (ThunderTown)
@ 2022-09-16  3:17   ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 20+ messages in thread
From: Leizhen (ThunderTown) @ 2022-09-16  3:17 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, live-patching, linux-kernel, Masahiro Yamada,
	Alexei Starovoitov, Jiri Olsa, Kees Cook, Andrew Morton,
	Luis Chamberlain, linux-modules



On 2022/9/16 9:27, Leizhen (ThunderTown) wrote:
> Hi, everybody:
>   Can anyone review it? Maybe I should split this patch series
> into two parts: kallsyms and livepatch.
>   In fact, the performance can be improved even if the compression policy
> of the symbol type is not changed, that is, the scripts/callsyms.c file is
> not modified, but we perform the len-based filtering first. That way, it'll
> be easier for everyone to review. OK, I'm ready for v3.

Sorry, I didn't sleep well yesterday. The length is the length of the
compressed string. Therefore, the solution is not feasible.

> 
> 
> On 2022/9/9 21:00, Zhen Lei wrote:
>> v1 --> v2:
>> Add self-test facility
>>
>> v1:
>> Currently, to search for a symbol, we need to expand the symbols in
>> 'kallsyms_names' one by one, and then use the expanded string for
>> comparison. This is very slow.
>>
>> In fact, we can first compress the name being looked up and then use
>> it for comparison when traversing 'kallsyms_names'.
>>
>> This patch series optimizes the performance of function kallsyms_lookup_name(),
>> and function klp_find_object_symbol() in the livepatch module. Based on the
>> test results, the performance overhead is reduced to 5%. That is, the
>> performance of these functions is improved by 20 times.
>>
>> To avoid increasing the kernel size in non-debug mode, the optimization is only
>> for the case CONFIG_KALLSYMS_ALL=y.
>>
>> Zhen Lei (8):
>>   scripts/kallsyms: don't compress symbol type when
>>     CONFIG_KALLSYMS_ALL=y
>>   scripts/kallsyms: rename build_initial_tok_table()
>>   kallsyms: Adjust the types of some local variables
>>   kallsyms: Improve the performance of kallsyms_lookup_name()
>>   kallsyms: Add helper kallsyms_on_each_match_symbol()
>>   livepatch: Use kallsyms_on_each_match_symbol() to improve performance
>>   livepatch: Improve the search performance of
>>     module_kallsyms_on_each_symbol()
>>   kallsyms: Add self-test facility
>>
>>  include/linux/kallsyms.h   |   8 ++
>>  init/Kconfig               |  13 ++
>>  kernel/Makefile            |   1 +
>>  kernel/kallsyms.c          | 135 ++++++++++++++++++++-
>>  kernel/kallsyms_selftest.c | 243 +++++++++++++++++++++++++++++++++++++
>>  kernel/livepatch/core.c    |  25 +++-
>>  kernel/module/kallsyms.c   |  13 +-
>>  scripts/kallsyms.c         |  19 ++-
>>  8 files changed, 441 insertions(+), 16 deletions(-)
>>  create mode 100644 kernel/kallsyms_selftest.c
>>
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v2 8/8] kallsyms: Add self-test facility
  2022-09-09 13:00 ` [PATCH v2 8/8] kallsyms: Add self-test facility Zhen Lei
@ 2022-09-17  8:07   ` Kees Cook
  2022-09-17 12:40     ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 20+ messages in thread
From: Kees Cook @ 2022-09-17  8:07 UTC (permalink / raw)
  To: Zhen Lei, Josh Poimboeuf, Jiri Kosina, Miroslav Benes,
	Petr Mladek, Joe Lawrence, live-patching, linux-kernel,
	Masahiro Yamada, Alexei Starovoitov, Jiri Olsa, Andrew Morton,
	Luis Chamberlain, linux-modules



On September 9, 2022 2:00:16 PM GMT+01:00, Zhen Lei <thunder.leizhen@huawei.com> wrote:
>Add some test cases to test the function and performance of some kallsyms
>interfaces, such as kallsyms_lookup_name. It also calculates the
>compression rate of the kallsyms compression algorithm for the current
>symbol set.
>
>Start self-test automatically after system startup.

I wonder if this would be better implemented as a kunit test? Shouldn't be too hard to convert. Take a look at things like lib/overflow_kunit.c, etc.
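
For reference, a minimal kunit skeleton for one of these checks might look
something like this (only a sketch of the idea, not a tested conversion;
the case and suite names are made up):

#include <kunit/test.h>

static void kallsyms_lookup_name_test(struct kunit *test)
{
	/* the functional check from the selftest, expressed as an assertion */
	KUNIT_EXPECT_EQ(test, kallsyms_lookup_name("vmap"),
			(unsigned long)vmap);
}

static struct kunit_case kallsyms_test_cases[] = {
	KUNIT_CASE(kallsyms_lookup_name_test),
	{}
};

static struct kunit_suite kallsyms_test_suite = {
	.name = "kallsyms",
	.test_cases = kallsyms_test_cases,
};
kunit_test_suite(kallsyms_test_suite);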

-Kees

-- 
Kees Cook


* Re: [PATCH v2 8/8] kallsyms: Add self-test facility
  2022-09-17  8:07   ` Kees Cook
@ 2022-09-17 12:40     ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 20+ messages in thread
From: Leizhen (ThunderTown) @ 2022-09-17 12:40 UTC (permalink / raw)
  To: Kees Cook, Josh Poimboeuf, Jiri Kosina, Miroslav Benes,
	Petr Mladek, Joe Lawrence, live-patching, linux-kernel,
	Masahiro Yamada, Alexei Starovoitov, Jiri Olsa, Andrew Morton,
	Luis Chamberlain, linux-modules



On 2022/9/17 16:07, Kees Cook wrote:
> 
> 
> On September 9, 2022 2:00:16 PM GMT+01:00, Zhen Lei <thunder.leizhen@huawei.com> wrote:
>> Add some test cases to test the function and performance of some kallsyms
>> interfaces, such as kallsyms_lookup_name. It also calculates the
>> compression rate of the kallsyms compression algorithm for the current
>> symbol set.
>>
>> Start self-test automatically after system startup.

if CONFIG_KALLSYMS_SELFTEST=y

> 
> I wonder if this would be better implemented as a kunit test? Shouldn't be too hard to convert. Take a look at things like lib/overflow_kunit.c, etc.

Yes, I can try to define one for each type of symbol, such as: bss, data, weak, static.
In addition, we can use kallsyms_offsets[] to do a full test.

> 
> -Kees
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v2 7/8] livepatch: Improve the search performance of module_kallsyms_on_each_symbol()
  2022-09-09 13:00 ` [PATCH v2 7/8] livepatch: Improve the search performance of module_kallsyms_on_each_symbol() Zhen Lei
@ 2022-09-20 12:08   ` Petr Mladek
  2022-09-20 14:01     ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 20+ messages in thread
From: Petr Mladek @ 2022-09-20 12:08 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
	live-patching, linux-kernel, Masahiro Yamada, Alexei Starovoitov,
	Jiri Olsa, Kees Cook, Andrew Morton, Luis Chamberlain,
	linux-modules

On Fri 2022-09-09 21:00:15, Zhen Lei wrote:
> Currently we traverse all symbols of all modules to find the specified
> function for the specified module. But in reality, we just need to find
> the given module and then traverse all the symbols in it.

I agree that it might be a noticeable speedup.

> In order to achieve this, the call to hook 'fn' is split into two phases:
> 1. Find the given module. Pointer 'mod' is passed. Hook 'fn' directly
>    returns the result of comparing the module name, without comparing
>    function names.
> 2. Find the given function in that module. 'mod = NULL' is passed.
>    Hook 'fn' skips the module name comparison and directly compares
>    function names.
> 
> Phase1: mod1-->mod2..(subsequent modules do not need to be compared)
>                 |
> Phase2:          -->f1-->f2-->f3
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  kernel/livepatch/core.c  |  7 ++-----
>  kernel/module/kallsyms.c | 13 ++++++++++++-
>  2 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 31b57ccf908017e..98e23137e4133bc 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -130,15 +130,12 @@ static int klp_find_callback(void *data, const char *name,
>  {
>  	struct klp_find_arg *args = data;
>  
> -	if ((mod && !args->objname) || (!mod && args->objname))
> -		return 0;
> +	if (mod)
> +		return strcmp(args->objname, mod->name);
>  
>  	if (strcmp(args->name, name))
>  		return 0;
>  
> -	if (args->objname && strcmp(args->objname, mod->name))
> -		return 0;
> -
>  	args->addr = addr;
>  	args->count++;
>  
> diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
> index f5c5c9175333df7..b033613e6c7e3bb 100644
> --- a/kernel/module/kallsyms.c
> +++ b/kernel/module/kallsyms.c
> @@ -510,6 +510,11 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
>  		if (mod->state == MODULE_STATE_UNFORMED)
>  			continue;
>  
> +		/* check mod->name first */
> +		ret = fn(data, NULL, mod, 0);
> +		if (ret)
> +			continue;

Hmm, it somehow gets too complicated. The same fn() callback has to
behave correctly in 3 different situations. I would suggest
simplifying everything:


1. Pass the requested modname as a parameter to module_kallsyms_on_each_symbol()

/*
 * Iterate over all symbols in the given @modname. For symbols from
 * vmlinux use kallsyms_on_each_symbol() instead.
 */
int module_kallsyms_on_each_symbol(const char *modname,
				   int (*fn)(void *, const char *,
					     struct module *, unsigned long),
				   void *data)

and do here:

		if (strcmp(modname, mod->name))
			continue;


2. We do not even need to pass .objname in struct klp_find_arg
   could simplify the callback:

static int klp_find_callback(void *data, const char *name,
			     struct module *mod, unsigned long addr)
{
	struct klp_find_arg *args = data;

	if (strcmp(args->name, name))
		return 0;

	args->addr = addr;
	args->count++;

	/*
	 * Finish the search when the symbol is found for the desired position
	 * or the position is not defined for a non-unique symbol.
	 */
	if ((args->pos && (args->count == args->pos)) ||
	    (!args->pos && (args->count > 1)))
		return 1;

	return 0;
}

3. As a result, the *mod parameter won't be used by any existing
   fn() callback and could be removed. This should be done as
   a separate patch. It also touches ftrace_lookup_symbols().

Best Regards,
Petr


* Re: [PATCH v2 7/8] livepatch: Improve the search performance of module_kallsyms_on_each_symbol()
  2022-09-20 12:08   ` Petr Mladek
@ 2022-09-20 14:01     ` Leizhen (ThunderTown)
  2022-09-21  6:56       ` Petr Mladek
  0 siblings, 1 reply; 20+ messages in thread
From: Leizhen (ThunderTown) @ 2022-09-20 14:01 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
	live-patching, linux-kernel, Masahiro Yamada, Alexei Starovoitov,
	Jiri Olsa, Kees Cook, Andrew Morton, Luis Chamberlain,
	linux-modules



On 2022/9/20 20:08, Petr Mladek wrote:
> On Fri 2022-09-09 21:00:15, Zhen Lei wrote:
>> Currently we traverse all symbols of all modules to find the specified
>> function for the specified module. But in reality, we just need to find
>> the given module and then traverse all the symbols in it.
> 
> I agree that it might be noticeable speedup.
> 
>> In order to achieve this, the call to hook 'fn' is split into two phases:
>> 1. Find the given module. Pointer 'mod' is passed. Hook 'fn' directly
>>    returns the result of comparing the module name, without comparing
>>    function names.
>> 2. Find the given function in that module. 'mod = NULL' is passed.
>>    Hook 'fn' skips the module name comparison and directly compares
>>    function names.
>>
>> Phase1: mod1-->mod2..(subsequent modules do not need to be compared)
>>                 |
>> Phase2:          -->f1-->f2-->f3
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  kernel/livepatch/core.c  |  7 ++-----
>>  kernel/module/kallsyms.c | 13 ++++++++++++-
>>  2 files changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
>> index 31b57ccf908017e..98e23137e4133bc 100644
>> --- a/kernel/livepatch/core.c
>> +++ b/kernel/livepatch/core.c
>> @@ -130,15 +130,12 @@ static int klp_find_callback(void *data, const char *name,
>>  {
>>  	struct klp_find_arg *args = data;
>>  
>> -	if ((mod && !args->objname) || (!mod && args->objname))
>> -		return 0;
>> +	if (mod)
>> +		return strcmp(args->objname, mod->name);
>>  
>>  	if (strcmp(args->name, name))
>>  		return 0;
>>  
>> -	if (args->objname && strcmp(args->objname, mod->name))
>> -		return 0;
>> -
>>  	args->addr = addr;
>>  	args->count++;
>>  
>> diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
>> index f5c5c9175333df7..b033613e6c7e3bb 100644
>> --- a/kernel/module/kallsyms.c
>> +++ b/kernel/module/kallsyms.c
>> @@ -510,6 +510,11 @@ int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
>>  		if (mod->state == MODULE_STATE_UNFORMED)
>>  			continue;
>>  
>> +		/* check mod->name first */
>> +		ret = fn(data, NULL, mod, 0);
>> +		if (ret)
>> +			continue;
> 
> Hmm, it somehow gets too complicated. The same fn() callback has to
> behave correctly in 3 different situations. I would suggest to
> simplify everything:
> 
> 
> 1. Pass the requested modname as a parameter to module_kallsyms_on_each_symbol()
> 
> /*
>  * Iterate over all symbols in the given @modname. For symbols from
>  * vmlinux use kallsyms_on_each_symbol() instead.
>  */
> int module_kallsyms_on_each_symbol(const char *modname,
> 				   int (*fn)(void *, const char *,
> 					     struct module *, unsigned long),
> 				   void *data)
> 
> and do here:
> 
> 		if (strcmp(modname, mod->name))
> 			continue;

Right, looks good. This makes the code logic much clearer. Thanks.

> 
> 
> 2. We do not even need to pass .objname in struct klp_find_arg
>    could simplify the callback:
> 

Yes

> static int klp_find_callback(void *data, const char *name,
> 			     struct module *mod, unsigned long addr)
> {
> 	struct klp_find_arg *args = data;
> 
> 	if (strcmp(args->name, name))
> 		return 0;
> 
> 	args->addr = addr;
> 	args->count++;
> 
> 	/*
> 	 * Finish the search when the symbol is found for the desired position
> 	 * or the position is not defined for a non-unique symbol.
> 	 */
> 	if ((args->pos && (args->count == args->pos)) ||
> 	    (!args->pos && (args->count > 1)))
> 		return 1;
> 
> 	return 0;
> }
> 
> 3. As a result the *mod parameter won't be used by any existing
>    fn() callback and could be removed. This should be done as
>    a separate patch. It touches also ftrace_lookup_symbols().

OK, I will do it tomorrow. The next version is v5.

> 
> Best Regards,
> Petr
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y
  2022-09-09 13:00 ` [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
@ 2022-09-20 17:26   ` Petr Mladek
  2022-09-21  2:42     ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 20+ messages in thread
From: Petr Mladek @ 2022-09-20 17:26 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
	live-patching, linux-kernel, Masahiro Yamada, Alexei Starovoitov,
	Jiri Olsa, Kees Cook, Andrew Morton, Luis Chamberlain,
	linux-modules

On Fri 2022-09-09 21:00:09, Zhen Lei wrote:
> Currently, to search for a symbol, we need to expand the symbols in
> 'kallsyms_names' one by one, and then use the expanded string for
> comparison. This is very slow.
> 
> In fact, we can first compress the name being looked up and then use
> it for comparison when traversing 'kallsyms_names'.

This does not explain how this patch modifies the compressed data
and why it is needed.


> This increases the size of 'kallsyms_names' by about 48 KiB (2.67%) on x86
> with defconfig.
> Before: kallsyms_num_syms=131392, sizeof(kallsyms_names)=1823659
> After : kallsyms_num_syms=131392, sizeof(kallsyms_names)=1872418
> 
> However, if CONFIG_KALLSYMS_ALL is not set, the size of 'kallsyms_names'
> does not change.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  scripts/kallsyms.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
> index f18e6dfc68c5839..ab6fe7cd014efd1 100644
> --- a/scripts/kallsyms.c
> +++ b/scripts/kallsyms.c
> @@ -60,6 +60,7 @@ static unsigned int table_size, table_cnt;
>  static int all_symbols;
>  static int absolute_percpu;
>  static int base_relative;
> +static int sym_start_idx;
>  
>  static int token_profit[0x10000];
>  
> @@ -511,7 +512,7 @@ static void learn_symbol(const unsigned char *symbol, int len)
>  {
>  	int i;
>  
> -	for (i = 0; i < len - 1; i++)
> +	for (i = sym_start_idx; i < len - 1; i++)
>  		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]++;

This skips the first character in the @symbol string. I do not see how
this is used in the new code, for example, in
kallsyms_on_each_match_symbol(), in the 5th patch. It seems to iterate
the compressed data from the 0th index:

	for (i = 0, off = 0; i < kallsyms_num_syms; i++)

>  }
>  
> @@ -520,7 +521,7 @@ static void forget_symbol(const unsigned char *symbol, int len)
>  {
>  	int i;
>  
> -	for (i = 0; i < len - 1; i++)
> +	for (i = sym_start_idx; i < len - 1; i++)
>  		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]--;
>  }
>  
> @@ -538,7 +539,7 @@ static unsigned char *find_token(unsigned char *str, int len,
>  {
>  	int i;
>  
> -	for (i = 0; i < len - 1; i++) {
> +	for (i = sym_start_idx; i < len - 1; i++) {
>  		if (str[i] == token[0] && str[i+1] == token[1])
>  			return &str[i];
>  	}
> @@ -780,6 +781,14 @@ int main(int argc, char **argv)
>  	} else if (argc != 1)
>  		usage();
>  
> +	/*
> +	 * Skip the symbol type, do not compress it to optimize the performance
> +	 * of finding or traversing symbols in kernel, this is good for modules
> +	 * such as livepatch.

I see. The type is added as the first character here.

in static struct sym_entry *read_symbol(FILE *in)
{
[...]
	/* include the type field in the symbol name, so that it gets
	 * compressed together */
[...]
	sym->sym[0] = type;
	strcpy(sym_name(sym), name);

It sounds a bit crazy. read_symbol() plays a trick so that the type
can be compressed, and this patch plays another trick to avoid that.


> +	 */
> +	if (all_symbols)
> +		sym_start_idx = 1;

This looks a bit fragile. My understanding is that the new code in
kernel/kallsyms.c and kernel/module/kallsyms.c depends on this change.

The faster search is used when CONFIG_KALLSYMS_ALL is defined.
But the data are compressed this way when this script is called
with --all-symbols.

Is it guaranteed that this script will generate the needed data
when CONFIG_KALLSYMS_ALL is defined?

What about 3rd party modules?

I would personally suggest storing the symbol type in a separate
sym->type entry in struct sym_entry and never compressing it.

IMHO, the size win is not worth the code complexity.

Well, people compiling the kernel for small devices might think
different. But they probably disable kallsyms completely.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y
  2022-09-20 17:26   ` Petr Mladek
@ 2022-09-21  2:42     ` Leizhen (ThunderTown)
  2022-09-21  6:51       ` Leizhen (ThunderTown)
  2022-09-21  7:13       ` Petr Mladek
  0 siblings, 2 replies; 20+ messages in thread
From: Leizhen (ThunderTown) @ 2022-09-21  2:42 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
	live-patching, linux-kernel, Masahiro Yamada, Alexei Starovoitov,
	Jiri Olsa, Kees Cook, Andrew Morton, Luis Chamberlain,
	linux-modules



On 2022/9/21 1:26, Petr Mladek wrote:
> On Fri 2022-09-09 21:00:09, Zhen Lei wrote:
>> Currently, to search for a symbol, we need to expand the symbols in
>> 'kallsyms_names' one by one, and then use the expanded string for
>> comparison. This is very slow.
>>
>> In fact, we can first compress the name being looked up and then use
>> it for comparison when traversing 'kallsyms_names'.
> 
> This does not explain how this patch modifies the compressed data
> and why it is needed.

Yes, I have updated the description from the v3 version.

So if we don't compress the symbol type, we can first compress the
searched symbol and then make a quick comparison based on the compressed
length and content. In this way, for entries with mismatched lengths,
there is no need to expand and compare strings. And for those matching
lengths, there's no need to expand the symbol. This saves a lot of time.
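
A rough sketch of that comparison (not the actual patch code; the
compress_symbol_name() helper is hypothetical and would apply the same
token table that scripts/kallsyms.c generates):

	static int kallsyms_find_compressed(const char *name)
	{
		unsigned char buf[KSYM_NAME_LEN];
		unsigned int i, off = 0;
		int len;

		len = compress_symbol_name(name, buf, sizeof(buf));

		for (i = 0; i < kallsyms_num_syms; i++) {
			unsigned char entry_len = kallsyms_names[off];

			/* entry: [len][type byte][compressed name], type not compressed */
			if (entry_len == len + 1 &&
			    !memcmp(&kallsyms_names[off + 2], buf, len))
				return i;	/* index of the matching symbol */

			off += entry_len + 1;
		}

		return -ENOENT;
	}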

> 
> 
>> This increases the size of 'kallsyms_names'. About 48KiB, 2.67%, on x86
>> with defconfig.
>> Before: kallsyms_num_syms=131392, sizeof(kallsyms_names)=1823659
>> After : kallsyms_num_syms=131392, sizeof(kallsyms_names)=1872418
>>
>> However, if CONFIG_KALLSYMS_ALL is not set, the size of 'kallsyms_names'
>> does not change.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  scripts/kallsyms.c | 15 ++++++++++++---
>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
>> index f18e6dfc68c5839..ab6fe7cd014efd1 100644
>> --- a/scripts/kallsyms.c
>> +++ b/scripts/kallsyms.c
>> @@ -60,6 +60,7 @@ static unsigned int table_size, table_cnt;
>>  static int all_symbols;
>>  static int absolute_percpu;
>>  static int base_relative;
>> +static int sym_start_idx;
>>  
>>  static int token_profit[0x10000];
>>  
>> @@ -511,7 +512,7 @@ static void learn_symbol(const unsigned char *symbol, int len)
>>  {
>>  	int i;
>>  
>> -	for (i = 0; i < len - 1; i++)
>> +	for (i = sym_start_idx; i < len - 1; i++)
>>  		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]++;
> 
> This skips the first character in the @symbol string. I do not see how
> this is used in the new code, for example, in
> kallsyms_on_each_match_symbol(), in the 5th patch. It seems to iterate
> the compressed data from the 0th index:
> 
> 	for (i = 0, off = 0; i < kallsyms_num_syms; i++)
> 
>>  }
>>  
>> @@ -520,7 +521,7 @@ static void forget_symbol(const unsigned char *symbol, int len)
>>  {
>>  	int i;
>>  
>> -	for (i = 0; i < len - 1; i++)
>> +	for (i = sym_start_idx; i < len - 1; i++)
>>  		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]--;
>>  }
>>  
>> @@ -538,7 +539,7 @@ static unsigned char *find_token(unsigned char *str, int len,
>>  {
>>  	int i;
>>  
>> -	for (i = 0; i < len - 1; i++) {
>> +	for (i = sym_start_idx; i < len - 1; i++) {
>>  		if (str[i] == token[0] && str[i+1] == token[1])
>>  			return &str[i];
>>  	}
>> @@ -780,6 +781,14 @@ int main(int argc, char **argv)
>>  	} else if (argc != 1)
>>  		usage();
>>  
>> +	/*
>> +	 * Skip the symbol type, do not compress it to optimize the performance
>> +	 * of finding or traversing symbols in kernel, this is good for modules
>> +	 * such as livepatch.
> 
> I see. The type is added as the first character here.
> 
> in static struct sym_entry *read_symbol(FILE *in)
> {
> [...]
> 	/* include the type field in the symbol name, so that it gets
> 	 * compressed together */

Good catch. I should remove "so that it gets compressed together"

> [...]
> 	sym->sym[0] = type;
> 	strcpy(sym_name(sym), name);
> 
> It sounds a bit crazy. read_symbol() makes a trick so that the type
> can be compressed. This patch does another trick to avoid it.
> 
> 
>> +	 */
>> +	if (all_symbols)
>> +		sym_start_idx = 1;
> 
> This looks a bit fragile. My understanding is that the new code in
> kernel/kallsyms.c and kernel/module/kallsyms.c depends on this change.

They do not depend on this change, because the index in
insert_real_symbols_in_table() still starts from 0. kallsyms_expand_symbol()
shows that it uses every byte of the compressed data to look up the token table.
Starting the index from 0 in insert_real_symbols_in_table() makes sure that the
raw 'type' character occupies a separate position in kallsyms_token_table[],
so that kallsyms_expand_symbol() can still work correctly.
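
For reference, a simplified sketch of that expansion loop (abridged from
kernel/kallsyms.c, bounds checks omitted): every byte indexes the token
table, and the uncompressed type byte simply expands to itself:

	static unsigned int expand_symbol(unsigned int off, char *result)
	{
		const unsigned char *data = &kallsyms_names[off];
		unsigned int len = *data++;
		int skipped_first = 0;

		off += len + 1;		/* offset of the next entry */

		while (len--) {
			const char *tptr;

			tptr = &kallsyms_token_table[kallsyms_token_index[*data++]];
			while (*tptr) {
				if (skipped_first)
					*result++ = *tptr;
				else
					skipped_first = 1;	/* drop the type char */
				tptr++;
			}
		}
		*result = '\0';

		return off;
	}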

> 
> The faster search is used when CONFIG_KALLSYMS_ALL is defined.
> But the data are compressed this way when this script is called
> with --all-symbols.
> 
> Is it guaranteed that this script will generate the needed data
> when CONFIG_KALLSYMS_ALL is defined?

Yes, see kallsyms() in scripts/link-vmlinux.sh
	if is_enabled CONFIG_KALLSYMS_ALL; then
                kallsymopt="${kallsymopt} --all-symbols"
        fi

> 
> What about 3rd party modules?

Should they call the API directly?

> 
> I would personally suggest to store the symbol type into a separate
> sym->type entry in struct sym_entry and never compress it.

Yes, I've also considered this, for the purpose of increasing the
compression ratio. See below: if the sorting is performed based on
the address and then on the type, we can record all the symbol
type information in less than 100 bytes. Of course, this makes the
functions that look up symbols based on the address loop several
times more. However, I would like to wait until the current patch
series is accepted. Otherwise, I'll have to rework a lot of patches
and it's too much work. To be honest, I've been coding for it these days.

cat /proc/kallsyms | awk '{print $2}' | sort | uniq -c | sort -r
  44678 r
  38299 t
  28315 T
  11644 d
   3768 D
   2778 b
    778 R
    641 B
    282 A
    178 W
     37 V
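
With only about a dozen distinct types showing up, one purely illustrative
shape of that idea (not something this series does) would be a small
run-length table, so finding the type of symbol 'idx' means scanning a few
dozen runs instead of storing one byte per symbol:

	struct sym_type_run {
		unsigned int first;	/* index of the first symbol in this run */
		char type;
	};

	static char sym_type_of(const struct sym_type_run *runs, int nr_runs,
				unsigned int idx)
	{
		char type = runs[0].type;
		int i;

		for (i = 1; i < nr_runs && runs[i].first <= idx; i++)
			type = runs[i].type;

		return type;
	}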

> 
> IMHO, the size win is not worth the code complexity.
> 
> Well, people compiling the kernel for small devices might think
> different. But they probably disable kallsyms completely.

Yes, to make the code look better, I've stopped tying this change to CONFIG_KALLSYMS_ALL since v3.

3. The symbol type is not compressed regardless of whether
   CONFIG_KALLSYMS_ALL is set or not. The memory overhead is increased
   by less than 20KiB if CONFIG_KALLSYMS_ALL=n.

> 
> Best Regards,
> Petr
> .
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y
  2022-09-21  2:42     ` Leizhen (ThunderTown)
@ 2022-09-21  6:51       ` Leizhen (ThunderTown)
  2022-09-21  7:13       ` Petr Mladek
  1 sibling, 0 replies; 20+ messages in thread
From: Leizhen (ThunderTown) @ 2022-09-21  6:51 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
	live-patching, linux-kernel, Masahiro Yamada, Alexei Starovoitov,
	Jiri Olsa, Kees Cook, Andrew Morton, Luis Chamberlain,
	linux-modules



On 2022/9/21 10:42, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/9/21 1:26, Petr Mladek wrote:
>> On Fri 2022-09-09 21:00:09, Zhen Lei wrote:
>>> Currently, to search for a symbol, we need to expand the symbols in
>>> 'kallsyms_names' one by one, and then use the expanded string for
>>> comparison. This is very slow.
>>>
>>> In fact, we can first compress the name being looked up and then use
>>> it for comparison when traversing 'kallsyms_names'.
>>
>> This does not explain how this patch modifies the compressed data
>> and why it is needed.
> 
> Yes, I have updated the description from the v3 version.
> 
> So if we don't compress the symbol type, we can first compress the
> searched symbol and then make a quick comparison based on the compressed
> length and content. In this way, for entries with mismatched lengths,
> there is no need to expand and compare strings. And for those matching
> lengths, there's no need to expand the symbol. This saves a lot of time.
> 
>>
>>
>>> This increases the size of 'kallsyms_names'. About 48KiB, 2.67%, on x86
>>> with defconfig.
>>> Before: kallsyms_num_syms=131392, sizeof(kallsyms_names)=1823659
>>> After : kallsyms_num_syms=131392, sizeof(kallsyms_names)=1872418
>>>
>>> However, if CONFIG_KALLSYMS_ALL is not set, the size of 'kallsyms_names'
>>> does not change.
>>>
>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>> ---
>>>  scripts/kallsyms.c | 15 ++++++++++++---
>>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
>>> index f18e6dfc68c5839..ab6fe7cd014efd1 100644
>>> --- a/scripts/kallsyms.c
>>> +++ b/scripts/kallsyms.c
>>> @@ -60,6 +60,7 @@ static unsigned int table_size, table_cnt;
>>>  static int all_symbols;
>>>  static int absolute_percpu;
>>>  static int base_relative;
>>> +static int sym_start_idx;
>>>  
>>>  static int token_profit[0x10000];
>>>  
>>> @@ -511,7 +512,7 @@ static void learn_symbol(const unsigned char *symbol, int len)
>>>  {
>>>  	int i;
>>>  
>>> -	for (i = 0; i < len - 1; i++)
>>> +	for (i = sym_start_idx; i < len - 1; i++)
>>>  		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]++;
>>
>> This skips the first character in the @symbol string. I do not see how
>> this is used in the new code, for example, in
>> kallsyms_on_each_match_symbol(), in the 5th patch. It seems to iterate
>> the compressed data from the 0th index:
>>
>> 	for (i = 0, off = 0; i < kallsyms_num_syms; i++)
>>
>>>  }
>>>  
>>> @@ -520,7 +521,7 @@ static void forget_symbol(const unsigned char *symbol, int len)
>>>  {
>>>  	int i;
>>>  
>>> -	for (i = 0; i < len - 1; i++)
>>> +	for (i = sym_start_idx; i < len - 1; i++)
>>>  		token_profit[ symbol[i] + (symbol[i + 1] << 8) ]--;
>>>  }
>>>  
>>> @@ -538,7 +539,7 @@ static unsigned char *find_token(unsigned char *str, int len,
>>>  {
>>>  	int i;
>>>  
>>> -	for (i = 0; i < len - 1; i++) {
>>> +	for (i = sym_start_idx; i < len - 1; i++) {
>>>  		if (str[i] == token[0] && str[i+1] == token[1])
>>>  			return &str[i];
>>>  	}
>>> @@ -780,6 +781,14 @@ int main(int argc, char **argv)
>>>  	} else if (argc != 1)
>>>  		usage();
>>>  
>>> +	/*
>>> +	 * Skip the symbol type, do not compress it to optimize the performance
>>> +	 * of finding or traversing symbols in kernel, this is good for modules
>>> +	 * such as livepatch.
>>
>> I see. The type is added as the first character here.
>>
>> in static struct sym_entry *read_symbol(FILE *in)
>> {
>> [...]
>> 	/* include the type field in the symbol name, so that it gets
>> 	 * compressed together */
> 
> Good catch. I should remove "so that it gets compressed together"
> 
>> [...]
>> 	sym->sym[0] = type;
>> 	strcpy(sym_name(sym), name);
>>
>> It sounds a bit crazy. read_symbol() makes a trick so that the type
>> can be compressed. This patch does another trick to avoid it.
>>
>>
>>> +	 */
>>> +	if (all_symbols)
>>> +		sym_start_idx = 1;
>>
>> This looks a bit fragile. My understanding is that the new code in
>> kernel/kallsyms.c and kernel/module/kallsyms.c depends on this change.
> 
> They do not depend on this change, because the index in
> insert_real_symbols_in_table() still starts from 0. kallsyms_expand_symbol()
> shows that it uses every byte of the compressed data to look up the token table.
> Starting the index from 0 in insert_real_symbols_in_table() makes sure that the
> raw 'type' character occupies a separate position in kallsyms_token_table[],
> so that kallsyms_expand_symbol() can still work correctly.
> 
>>
>> The faster search is used when CONFIG_KALLSYMS_ALL is defined.
>> But the data are compressed this way when this script is called
>> with --all-symbols.
>>
>> Is it guaranteed that this script will generate the needed data
>> when CONFIG_KALLSYMS_ALL is defined?
> 
> Yes, see kallsyms() in scripts/link-vmlinux.sh
> 	if is_enabled CONFIG_KALLSYMS_ALL; then
>                 kallsymopt="${kallsymopt} --all-symbols"
>         fi
> 
>>
>> What about 3rd party modules?
> 
> Should they call the API directly?
> 
>>
>> I would personally suggest to store the symbol type into a separate
>> sym->type entry in struct sym_entry and never compress it.
> 
> Yes, I've also considered this, for the purpose of increasing the
> compression ratio. See below: if the sorting is performed based on
> the address and then on the type, we can record all the symbol
> type information in less than 100 bytes. Of course, this makes the
> functions that look up symbols based on the address loop several
> times more. However, I would like to wait until the current patch
> series is accepted. Otherwise, I'll have to rework a lot of patches
> and it's too much work. To be honest, I've been coding for it these days.

The main thing is, I don't know if there will be any other impact
after this change. If this doesn't work out, I won't have to do it all
over again. I can refactor it after the current series has been fully reviewed.

> 
> cat /proc/kallsyms | awk '{print $2}' | sort | uniq -c | sort -r
>   44678 r
>   38299 t
>   28315 T
>   11644 d
>    3768 D
>    2778 b
>     778 R
>     641 B
>     282 A
>     178 W
>      37 V
> 
>>
>> IMHO, the size win is not worth the code complexity.
>>
>> Well, people compiling the kernel for small devices might think
>> different. But they probably disable kallsyms completely.
> 
> Yes, to make the code look better, I've stopped tying this change to CONFIG_KALLSYMS_ALL since v3.
> 
> 3. The symbol type is not compressed regardless of whether
>    CONFIG_KALLSYMS_ALL is set or not. The memory overhead is increased
>    by less than 20KiB if CONFIG_KALLSYMS_ALL=n.
> 
>>
>> Best Regards,
>> Petr
>> .
>>
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 7/8] livepatch: Improve the search performance of module_kallsyms_on_each_symbol()
  2022-09-20 14:01     ` Leizhen (ThunderTown)
@ 2022-09-21  6:56       ` Petr Mladek
  0 siblings, 0 replies; 20+ messages in thread
From: Petr Mladek @ 2022-09-21  6:56 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
	live-patching, linux-kernel, Masahiro Yamada, Alexei Starovoitov,
	Jiri Olsa, Kees Cook, Andrew Morton, Luis Chamberlain,
	linux-modules

On Tue 2022-09-20 22:01:44, Leizhen (ThunderTown) wrote:
> On 2022/9/20 20:08, Petr Mladek wrote:
> > On Fri 2022-09-09 21:00:15, Zhen Lei wrote:
> > 3. As a result the *mod parameter won't be used by any existing
> >    fn() callback and could be removed. This should be done as
> >    a separate patch. It touches also ftrace_lookup_symbols().
> 
> OK, I will do it tomorrow. The next version is v5.

Please, wait a bit. I had missed that there was already a v4
that addressed some of my concerns. I am going to look at it.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y
  2022-09-21  2:42     ` Leizhen (ThunderTown)
  2022-09-21  6:51       ` Leizhen (ThunderTown)
@ 2022-09-21  7:13       ` Petr Mladek
  1 sibling, 0 replies; 20+ messages in thread
From: Petr Mladek @ 2022-09-21  7:13 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
	live-patching, linux-kernel, Masahiro Yamada, Alexei Starovoitov,
	Jiri Olsa, Kees Cook, Andrew Morton, Luis Chamberlain,
	linux-modules

On Wed 2022-09-21 10:42:56, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/9/21 1:26, Petr Mladek wrote:
> > On Fri 2022-09-09 21:00:09, Zhen Lei wrote:
> >> Currently, to search for a symbol, we need to expand the symbols in
> >> 'kallsyms_names' one by one, and then use the expanded string for
> >> comparison. This is very slow.
> >>
> >> In fact, we can first compress the name being looked up and then use
> >> it for comparison when traversing 'kallsyms_names'.
> > 
> > This does not explain how this patch modifies the compressed data
> > and why it is needed.
> 
> Yes, I have updated the description from the v3 version.

Ah, there is even a v4. I had missed that. The commit message looks
much better there.

> So if we don't compress the symbol type, we can first compress the
> searched symbol and then make a quick comparison based on the compressed
> length and content. In this way, for entries with mismatched lengths,
> there is no need to expand and compare strings. And for those matching
> lengths, there's no need to expand the symbol. This saves a lot of time.
> 
> > 
> > 
> >> This increases the size of 'kallsyms_names'. About 48KiB, 2.67%, on x86
> >> with defconfig.
> >> Before: kallsyms_num_syms=131392, sizeof(kallsyms_names)=1823659
> >> After : kallsyms_num_syms=131392, sizeof(kallsyms_names)=1872418
> >>
> >> However, if CONFIG_KALLSYMS_ALL is not set, the size of 'kallsyms_names'
> >> does not change.
> >>
> >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> >> ---
> >>  scripts/kallsyms.c | 15 ++++++++++++---
> >>  1 file changed, 12 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
> >> index f18e6dfc68c5839..ab6fe7cd014efd1 100644
> >> --- a/scripts/kallsyms.c
> >> +++ b/scripts/kallsyms.c
> >> +	 */
> >> +	if (all_symbols)
> >> +		sym_start_idx = 1;
> > 
> > This looks a bit fragile. My understanding is that the new code in
> > kernel/kallsyms.c and kernel/module/kallsyms.c depends on this change.
> 
> They do not depend on this change, because the index in
> insert_real_symbols_in_table() still starts from 0. kallsyms_expand_symbol()
> shows that it uses every byte of the compressed data to look up the token table.
> Starting the index from 0 in insert_real_symbols_in_table() makes sure that the
> raw 'type' character occupies a separate position in kallsyms_token_table[],
> so that kallsyms_expand_symbol() can still work correctly.

I guess that we are talking about different things. Anyway, please
ignore my concern that it is fragile. The change in
scripts/kallsyms.c no longer depends on the --all-symbols parameter
in the latest v4 patchset.

> > I would personally suggest to store the symbol type into a separate
> > sym->type entry in struct sym_entry and never compress it.
> 
> Yes, I've also considered this, for the purpose of increasing the
> compression ratio. See below: if the sorting is performed based on
> the address and then on the type, we can record all the symbol
> type information in less than 100 bytes. Of course, this makes the
> functions that look up symbols based on the address loop several
> times more. However, I would like to wait until the current patch
> series is accepted. Otherwise, I'll have to rework a lot of patches
> and it's too much work. To be honest, I've been coding for it these days.
> 
> cat /proc/kallsyms | awk '{print $2}' | sort | uniq -c | sort -r
>   44678 r
>   38299 t
>   28315 T
>   11644 d
>    3768 D
>    2778 b
>     778 R
>     641 B
>     282 A
>     178 W
>      37 V

This is another optimization. I agree that we could do it later after
this patchset is accepted.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-09-21  7:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-09 13:00 [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Zhen Lei
2022-09-09 13:00 ` [PATCH v2 1/8] scripts/kallsyms: don't compress symbol type when CONFIG_KALLSYMS_ALL=y Zhen Lei
2022-09-20 17:26   ` Petr Mladek
2022-09-21  2:42     ` Leizhen (ThunderTown)
2022-09-21  6:51       ` Leizhen (ThunderTown)
2022-09-21  7:13       ` Petr Mladek
2022-09-09 13:00 ` [PATCH v2 2/8] scripts/kallsyms: rename build_initial_tok_table() Zhen Lei
2022-09-09 13:00 ` [PATCH v2 3/8] kallsyms: Adjust the types of some local variables Zhen Lei
2022-09-09 13:00 ` [PATCH v2 4/8] kallsyms: Improve the performance of kallsyms_lookup_name() Zhen Lei
2022-09-09 13:00 ` [PATCH v2 5/8] kallsyms: Add helper kallsyms_on_each_match_symbol() Zhen Lei
2022-09-09 13:00 ` [PATCH v2 6/8] livepatch: Use kallsyms_on_each_match_symbol() to improve performance Zhen Lei
2022-09-09 13:00 ` [PATCH v2 7/8] livepatch: Improve the search performance of module_kallsyms_on_each_symbol() Zhen Lei
2022-09-20 12:08   ` Petr Mladek
2022-09-20 14:01     ` Leizhen (ThunderTown)
2022-09-21  6:56       ` Petr Mladek
2022-09-09 13:00 ` [PATCH v2 8/8] kallsyms: Add self-test facility Zhen Lei
2022-09-17  8:07   ` Kees Cook
2022-09-17 12:40     ` Leizhen (ThunderTown)
2022-09-16  1:27 ` [PATCH v2 0/8] kallsyms: Optimizes the performance of lookup symbols Leizhen (ThunderTown)
2022-09-16  3:17   ` Leizhen (ThunderTown)
