* [PATCH 1/3] objtool: Write .orc_lookup section header
2020-08-07 4:17 ` [PATCH v2 0/3] Build ORC fast lookup table in scripts/sorttable tool Huaixin Chang
@ 2020-08-07 4:18 ` Huaixin Chang
2020-08-07 4:18 ` [PATCH 2/3] scripts/sorttable: Build ORC fast lookup table via sorttable tool Huaixin Chang
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Huaixin Chang @ 2020-08-07 4:18 UTC
To: changhuaixin
Cc: bp, hpa, jpoimboe, linux-kbuild, linux-kernel, luto, michal.lkml,
mingo, peterz, tglx, x86, yamada.masahiro
This patch sets sh_type of the .orc_lookup section to SHT_PROGBITS and
clears the write bit (SHF_WRITE) from its sh_flags. To get the section
header written, call elf_create_section() on .orc_lookup with zero entries.
Originally, the section headers were as follows:
[23] .orc_unwind_ip PROGBITS ffffffff8259f4b8 0179f4b8
0000000000178bbc 0000000000000000 A 0 0 1
[24] .rela.orc_unwind_ RELA 0000000000000000 11e57b58
00000000008d4668 0000000000000018 I 70 23 8
[25] .orc_unwind PROGBITS ffffffff82718074 01918074
000000000023519a 0000000000000000 A 0 0 1
[26] .orc_lookup NOBITS ffffffff8294d210 01b4d20e
0000000000030038 0000000000000000 WA 0 0 1
[27] .vvar PROGBITS ffffffff8297e000 01b7e000
0000000000001000 0000000000000000 WA 0 0 16
Now, they are changed to:
[23] .orc_unwind_ip PROGBITS ffffffff8259f4b8 0179f4b8
0000000000178bbc 0000000000000000 A 0 0 1
[24] .rela.orc_unwind_ RELA 0000000000000000 11e57b58
00000000008d4668 0000000000000018 I 70 23 8
[25] .orc_unwind PROGBITS ffffffff82718074 01918074
000000000023519a 0000000000000000 A 0 0 1
[26] .orc_lookup PROGBITS ffffffff8294d210 01b4d210
0000000000030038 0000000000000000 A 0 0 1
[27] .vvar PROGBITS ffffffff8297e000 01b7e000
0000000000001000 0000000000000000 WA 0 0 16
Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
---
tools/objtool/orc_gen.c | 4 ++++
1 file changed, 4 insertions(+)
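For illustration only (not part of the patch): a minimal C sketch of the header change described in the commit message. The struct and the function name are hypothetical stand-ins, and the constants are defined locally with the values the ELF specification assigns to them, so the snippet does not depend on <elf.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values from the ELF specification; the names mirror those in <elf.h>. */
enum { SHT_PROGBITS = 1, SHT_NOBITS = 8 };
enum { SHF_WRITE = 0x1, SHF_ALLOC = 0x2 };

/* Hypothetical, trimmed-down stand-in for an ELF section header. */
struct demo_shdr {
	uint32_t sh_type;
	uint64_t sh_flags;
};

/*
 * Turn a section header into PROGBITS and drop its write permission,
 * leaving every other flag (for example SHF_ALLOC) untouched.
 */
static void make_progbits_readonly(struct demo_shdr *sh)
{
	assert(sh != NULL);
	sh->sh_type = SHT_PROGBITS;
	sh->sh_flags &= ~(uint64_t)SHF_WRITE;
}
```

Applied to a header that starts as NOBITS with flags "WA", this leaves PROGBITS with flags "A", which matches the before/after listings in the commit message.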
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 968f55e6dd94..2b2653979ad6 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -189,6 +189,10 @@ int create_orc_sections(struct objtool_file *file)
u_sec = elf_create_section(file->elf, ".orc_unwind",
sizeof(struct orc_entry), idx);
+ /* make flags of section orc_lookup right */
+ if (!elf_create_section(file->elf, ".orc_lookup", sizeof(int), 0))
+ return -1;
+
/* populate sections */
idx = 0;
for_each_sec(file, sec) {
--
2.14.4.44.g2045bb6
* [PATCH 2/3] scripts/sorttable: Build ORC fast lookup table via sorttable tool
2020-08-07 4:17 ` [PATCH v2 0/3] Build ORC fast lookup table in scripts/sorttable tool Huaixin Chang
2020-08-07 4:18 ` [PATCH 1/3] objtool: Write .orc_lookup section header Huaixin Chang
@ 2020-08-07 4:18 ` Huaixin Chang
2020-08-07 4:18 ` [PATCH 3/3] x86/unwind/orc: Simplify unwind_init() for x86 boot Huaixin Chang
2020-08-19 3:03 ` [PATCH v2 0/3] Build ORC fast lookup table in scripts/sorttable tool changhuaixin
3 siblings, 0 replies; 9+ messages in thread
From: Huaixin Chang @ 2020-08-07 4:18 UTC
To: changhuaixin
Cc: bp, hpa, jpoimboe, linux-kbuild, linux-kernel, luto, michal.lkml,
mingo, peterz, tglx, x86, yamada.masahiro
The ORC tables are already sorted by the sorttable tool, so build the
fast lookup table there as well. This saves 6380us of boot time on an
Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz with 64 cores.
Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
---
arch/x86/include/asm/orc_lookup.h | 16 ------
arch/x86/include/asm/orc_types.h | 16 ++++++
arch/x86/kernel/vmlinux.lds.S | 2 +-
scripts/sorttable.h | 96 +++++++++++++++++++++++++++++++---
tools/arch/x86/include/asm/orc_types.h | 16 ++++++
5 files changed, 122 insertions(+), 24 deletions(-)
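For illustration only (not part of the patch): a simplified, standalone sketch of how the fast lookup table is filled. It assumes the instruction addresses are available as a sorted array of absolute values, whereas the patch works on the relative .orc_unwind_ip entries and returns struct orc_entry pointers. The names find_rightmost() and build_lookup() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LOOKUP_BLOCK_ORDER	8
#define LOOKUP_BLOCK_SIZE	(1 << LOOKUP_BLOCK_ORDER)

/*
 * Return the index of the rightmost entry whose address is <= ip.
 * ips[] must be sorted in ascending order and non-empty. If ip lies
 * before every entry, index 0 is returned.
 */
static size_t find_rightmost(const unsigned long *ips, size_t n,
			     unsigned long ip)
{
	size_t lo = 0, hi = n, found = 0;

	assert(n > 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ips[mid] <= ip) {
			found = mid;
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return found;
}

/*
 * Fill lookup[0 .. nblocks-1]: one entry per LOOKUP_BLOCK_SIZE-sized
 * block starting at start_ip, plus a final entry for stop_ip that
 * gives the last block an end.
 */
static void build_lookup(const unsigned long *ips, size_t n,
			 unsigned long start_ip, unsigned long stop_ip,
			 unsigned int *lookup, size_t nblocks)
{
	size_t i;

	assert(nblocks > 0);
	for (i = 0; i + 1 < nblocks; i++)
		lookup[i] = (unsigned int)find_rightmost(ips, n,
				start_ip + LOOKUP_BLOCK_SIZE * i);

	lookup[nblocks - 1] = (unsigned int)find_rightmost(ips, n, stop_ip);
}
```

Each lookup entry is an index into the sorted table, so a later search only has to cover the entries between two neighbouring lookup values instead of the whole table.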
diff --git a/arch/x86/include/asm/orc_lookup.h b/arch/x86/include/asm/orc_lookup.h
index 241631282e43..c75eb1f82bdb 100644
--- a/arch/x86/include/asm/orc_lookup.h
+++ b/arch/x86/include/asm/orc_lookup.h
@@ -5,22 +5,6 @@
#ifndef _ORC_LOOKUP_H
#define _ORC_LOOKUP_H
-/*
- * This is a lookup table for speeding up access to the .orc_unwind table.
- * Given an input address offset, the corresponding lookup table entry
- * specifies a subset of the .orc_unwind table to search.
- *
- * Each block represents the end of the previous range and the start of the
- * next range. An extra block is added to give the last range an end.
- *
- * The block size should be a power of 2 to avoid a costly 'div' instruction.
- *
- * A block size of 256 was chosen because it roughly doubles unwinder
- * performance while only adding ~5% to the ORC data footprint.
- */
-#define LOOKUP_BLOCK_ORDER 8
-#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER)
-
#ifndef LINKER_SCRIPT
extern unsigned int orc_lookup[];
diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h
index d25534940bde..b93c6a7b4da4 100644
--- a/arch/x86/include/asm/orc_types.h
+++ b/arch/x86/include/asm/orc_types.h
@@ -9,6 +9,22 @@
#include <linux/types.h>
#include <linux/compiler.h>
+/*
+ * This is a lookup table for speeding up access to the .orc_unwind table.
+ * Given an input address offset, the corresponding lookup table entry
+ * specifies a subset of the .orc_unwind table to search.
+ *
+ * Each block represents the end of the previous range and the start of the
+ * next range. An extra block is added to give the last range an end.
+ *
+ * The block size should be a power of 2 to avoid a costly 'div' instruction.
+ *
+ * A block size of 256 was chosen because it roughly doubles unwinder
+ * performance while only adding ~5% to the ORC data footprint.
+ */
+#define LOOKUP_BLOCK_ORDER 8
+#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER)
+
/*
* The ORC_REG_* registers are base registers which are used to find other
* registers on the stack.
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 9a03e5b23135..75760e7f6319 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -29,7 +29,7 @@
#include <asm/asm-offsets.h>
#include <asm/thread_info.h>
#include <asm/page_types.h>
-#include <asm/orc_lookup.h>
+#include <asm/orc_types.h>
#include <asm/cache.h>
#include <asm/boot.h>
diff --git a/scripts/sorttable.h b/scripts/sorttable.h
index a2baa2fefb13..de9822f8ae8f 100644
--- a/scripts/sorttable.h
+++ b/scripts/sorttable.h
@@ -93,12 +93,50 @@
char g_err[ERRSTR_MAXSZ];
int *g_orc_ip_table;
struct orc_entry *g_orc_table;
+static unsigned long orc_ip_table_offset;
pthread_t orc_sort_thread;
+struct orc_sort_param {
+ size_t lookup_table_size;
+ unsigned int *orc_lookup_table;
+ unsigned long start_ip;
+ size_t text_size;
+ unsigned int orc_num_entries;
+};
+
static inline unsigned long orc_ip(const int *ip)
{
- return (unsigned long)ip + *ip;
+ return (unsigned long)ip + *ip + orc_ip_table_offset;
+}
+
+static struct orc_entry *__orc_find(int *ip_table, struct orc_entry *u_table,
+ unsigned int num_entries, unsigned long ip)
+{
+ int *first = ip_table;
+ int *last = ip_table + num_entries - 1;
+ int *mid = first, *found = first;
+
+ if (!num_entries)
+ return NULL;
+
+ /*
+ * Do a binary range search to find the rightmost duplicate of a given
+ * starting address. Some entries are section terminators which are
+ * "weak" entries for ensuring there are no gaps. They should be
+ * ignored when they conflict with a real entry.
+ */
+ while (first <= last) {
+ mid = first + ((last - first) / 2);
+
+ if (orc_ip(mid) <= ip) {
+ found = mid;
+ first = mid + 1;
+ } else
+ last = mid - 1;
+ }
+
+ return u_table + (found - ip_table);
}
static int orc_sort_cmp(const void *_a, const void *_b)
@@ -130,18 +168,24 @@ static void *sort_orctable(void *arg)
int *idxs = NULL;
int *tmp_orc_ip_table = NULL;
struct orc_entry *tmp_orc_table = NULL;
- unsigned int *orc_ip_size = (unsigned int *)arg;
- unsigned int num_entries = *orc_ip_size / sizeof(int);
+ struct orc_sort_param *param = (struct orc_sort_param *)arg;
+ unsigned int num_entries = param->orc_num_entries;
+ unsigned int orc_ip_size = num_entries * sizeof(int);
unsigned int orc_size = num_entries * sizeof(struct orc_entry);
+ unsigned int lookup_num_blocks = param->lookup_table_size / sizeof(int);
+ unsigned int *orc_lookup = param->orc_lookup_table;
+ unsigned long lookup_start_ip = param->start_ip;
+ unsigned long lookup_stop_ip = param->start_ip + param->text_size;
+ struct orc_entry *orc;
- idxs = (int *)malloc(*orc_ip_size);
+ idxs = (int *)malloc(orc_ip_size);
if (!idxs) {
snprintf(g_err, ERRSTR_MAXSZ, "malloc idxs: %s",
strerror(errno));
pthread_exit(g_err);
}
- tmp_orc_ip_table = (int *)malloc(*orc_ip_size);
+ tmp_orc_ip_table = (int *)malloc(orc_ip_size);
if (!tmp_orc_ip_table) {
snprintf(g_err, ERRSTR_MAXSZ, "malloc tmp_orc_ip_table: %s",
strerror(errno));
@@ -173,6 +217,28 @@ static void *sort_orctable(void *arg)
g_orc_table[i] = tmp_orc_table[idxs[i]];
}
+ for (i = 0; i < lookup_num_blocks-1; i++) {
+ orc = __orc_find(g_orc_ip_table, g_orc_table,
+ num_entries,
+ lookup_start_ip + (LOOKUP_BLOCK_SIZE * i));
+ if (!orc) {
+ snprintf(g_err, ERRSTR_MAXSZ,
+ "Corrupt .orc_unwind table\n");
+ pthread_exit(g_err);
+ }
+
+ orc_lookup[i] = orc - g_orc_table;
+ }
+
+ /* Initialize the ending block: */
+ orc = __orc_find(g_orc_ip_table, g_orc_table, num_entries,
+ lookup_stop_ip);
+ if (!orc) {
+ snprintf(g_err, ERRSTR_MAXSZ, "Corrupt .orc_unwind table\n");
+ pthread_exit(g_err);
+ }
+ orc_lookup[lookup_num_blocks-1] = orc - g_orc_table;
+
free(idxs);
free(tmp_orc_ip_table);
free(tmp_orc_table);
@@ -221,6 +287,8 @@ static int do_sort(Elf_Ehdr *ehdr,
unsigned int orc_ip_size = 0;
unsigned int orc_size = 0;
unsigned int orc_num_entries = 0;
+ unsigned long orc_ip_addr = 0;
+ struct orc_sort_param param;
#endif
shstrndx = r2(&ehdr->e_shstrndx);
@@ -259,17 +327,27 @@ static int do_sort(Elf_Ehdr *ehdr,
orc_ip_size = s->sh_size;
g_orc_ip_table = (int *)((void *)ehdr +
s->sh_offset);
+ orc_ip_addr = s->sh_addr;
}
if (!strcmp(secstrings + idx, ".orc_unwind")) {
orc_size = s->sh_size;
g_orc_table = (struct orc_entry *)((void *)ehdr +
s->sh_offset);
}
+ if (!strcmp(secstrings + idx, ".orc_lookup")) {
+ param.lookup_table_size = s->sh_size;
+ param.orc_lookup_table = (unsigned int *)
+ ((void *)ehdr + s->sh_offset);
+ }
+ if (!strcmp(secstrings + idx, ".text")) {
+ param.text_size = s->sh_size;
+ param.start_ip = s->sh_addr;
+ }
#endif
} /* for loop */
#if defined(SORTTABLE_64) && defined(UNWINDER_ORC_ENABLED)
- if (!g_orc_ip_table || !g_orc_table) {
+ if (!g_orc_ip_table || !g_orc_table || !param.orc_lookup_table) {
fprintf(stderr,
"incomplete ORC unwind tables in file: %s\n", fname);
goto out;
@@ -285,9 +363,13 @@ static int do_sort(Elf_Ehdr *ehdr,
goto out;
}
+ /* Make orc_ip return virtual address at execution. */
+ orc_ip_table_offset = orc_ip_addr - (unsigned long)g_orc_ip_table;
+
/* create thread to sort ORC unwind tables concurrently */
+ param.orc_num_entries = orc_num_entries;
if (pthread_create(&orc_sort_thread, NULL,
- sort_orctable, &orc_ip_size)) {
+ sort_orctable, &param)) {
fprintf(stderr,
"pthread_create orc_sort_thread failed '%s': %s\n",
strerror(errno), fname);
diff --git a/tools/arch/x86/include/asm/orc_types.h b/tools/arch/x86/include/asm/orc_types.h
index d25534940bde..b93c6a7b4da4 100644
--- a/tools/arch/x86/include/asm/orc_types.h
+++ b/tools/arch/x86/include/asm/orc_types.h
@@ -9,6 +9,22 @@
#include <linux/types.h>
#include <linux/compiler.h>
+/*
+ * This is a lookup table for speeding up access to the .orc_unwind table.
+ * Given an input address offset, the corresponding lookup table entry
+ * specifies a subset of the .orc_unwind table to search.
+ *
+ * Each block represents the end of the previous range and the start of the
+ * next range. An extra block is added to give the last range an end.
+ *
+ * The block size should be a power of 2 to avoid a costly 'div' instruction.
+ *
+ * A block size of 256 was chosen because it roughly doubles unwinder
+ * performance while only adding ~5% to the ORC data footprint.
+ */
+#define LOOKUP_BLOCK_ORDER 8
+#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER)
+
/*
* The ORC_REG_* registers are base registers which are used to find other
* registers on the stack.
--
2.14.4.44.g2045bb6
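For illustration only (not part of the patch): a sketch of why orc_ip() needs orc_ip_table_offset in the sorttable tool. It assumes each .orc_unwind_ip entry stores a 32-bit offset from the entry's own run-time address to its target, and that the tool reads the table from a buffer at some unrelated host address. encode_table() and decode_entry() are hypothetical names; the narrowing casts rely on the usual two's-complement behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store, for each entry, the offset from the entry's run-time address
 * (table_vaddr + i * 4) to its target address.
 */
static void encode_table(int32_t *table, size_t n, uint64_t table_vaddr,
			 const uint64_t *targets)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t entry_vaddr = table_vaddr + i * sizeof(int32_t);
		int64_t delta = (int64_t)(targets[i] - entry_vaddr);

		table[i] = (int32_t)delta;
	}
}

/*
 * Decode an entry that lives in a host buffer. Without table_offset the
 * result would be relative to the buffer's host address; adding
 * table_offset (run-time table address minus host buffer address)
 * yields the run-time virtual address instead.
 */
static uint64_t decode_entry(const int32_t *entry, uint64_t table_offset)
{
	return (uint64_t)(uintptr_t)entry + (uint64_t)(int64_t)*entry +
	       table_offset;
}
```

With table_offset computed once as the section's sh_addr minus the address of the in-memory copy, the decoded values can be compared directly against addresses derived from the .text section's sh_addr.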
* [PATCH 3/3] x86/unwind/orc: Simplify unwind_init() for x86 boot
2020-08-07 4:17 ` [PATCH v2 0/3] Build ORC fast lookup table in scripts/sorttable tool Huaixin Chang
2020-08-07 4:18 ` [PATCH 1/3] objtool: Write .orc_lookup section header Huaixin Chang
2020-08-07 4:18 ` [PATCH 2/3] scripts/sorttable: Build ORC fast lookup table via sorttable tool Huaixin Chang
@ 2020-08-07 4:18 ` Huaixin Chang
2020-08-19 3:03 ` [PATCH v2 0/3] Build ORC fast lookup table in scripts/sorttable tool changhuaixin
3 siblings, 0 replies; 9+ messages in thread
From: Huaixin Chang @ 2020-08-07 4:18 UTC
To: changhuaixin
Cc: bp, hpa, jpoimboe, linux-kbuild, linux-kernel, luto, michal.lkml,
mingo, peterz, tglx, x86, yamada.masahiro
The ORC fast lookup table is now built by the scripts/sorttable tool at
build time. All that is left for unwind_init() is to set lookup_num_blocks.
Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
---
arch/x86/kernel/unwind_orc.c | 41 ++---------------------------------------
1 file changed, 2 insertions(+), 39 deletions(-)
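For illustration only (not part of the patch): a standalone sketch of how a prebuilt lookup table narrows the search at unwind time, which is why unwind_init() only needs the number of blocks. It assumes absolute, sorted addresses and a lookup table laid out with one entry per 256-byte block plus a terminating entry. search_range() and lookup_entry() are hypothetical names and this is not the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define LOOKUP_BLOCK_ORDER	8
#define LOOKUP_BLOCK_SIZE	(1 << LOOKUP_BLOCK_ORDER)

/*
 * Rightmost index in [start, stop) whose address is <= ip;
 * returns start if no entry in the range qualifies.
 */
static size_t search_range(const unsigned long *ips, size_t start,
			   size_t stop, unsigned long ip)
{
	size_t lo = start, hi = stop, found = start;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ips[mid] <= ip) {
			found = mid;
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return found;
}

/*
 * Pick the block that contains ip, then search only the table slice
 * bounded by that block's lookup entry and the next one.
 */
static size_t lookup_entry(const unsigned long *ips, size_t n,
			   const unsigned int *lookup, size_t nblocks,
			   unsigned long start_ip, unsigned long ip)
{
	size_t idx, start, stop;

	assert(ip >= start_ip);
	idx = (ip - start_ip) >> LOOKUP_BLOCK_ORDER;
	assert(idx + 1 < nblocks);

	start = lookup[idx];
	stop = (size_t)lookup[idx + 1] + 1;
	assert(start < stop && stop <= n);

	return search_range(ips, start, stop, ip);
}
```

The block count passed as nblocks plays the role of lookup_num_blocks, which the patch derives from the pointer difference orc_lookup_end - orc_lookup.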
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index ec88bbe08a32..29890389b4f6 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -264,48 +264,11 @@ void unwind_module_init(struct module *mod, void *_orc_ip, size_t orc_ip_size,
void __init unwind_init(void)
{
- size_t orc_ip_size = (void *)__stop_orc_unwind_ip - (void *)__start_orc_unwind_ip;
- size_t orc_size = (void *)__stop_orc_unwind - (void *)__start_orc_unwind;
- size_t num_entries = orc_ip_size / sizeof(int);
- struct orc_entry *orc;
- int i;
-
- if (!num_entries || orc_ip_size % sizeof(int) != 0 ||
- orc_size % sizeof(struct orc_entry) != 0 ||
- num_entries != orc_size / sizeof(struct orc_entry)) {
- orc_warn("WARNING: Bad or missing .orc_unwind table. Disabling unwinder.\n");
- return;
- }
-
/*
- * Note, the orc_unwind and orc_unwind_ip tables were already
- * sorted at build time via the 'sorttable' tool.
- * It's ready for binary search straight away, no need to sort it.
+ * All ORC tables are sorted and built via sorttable tool. Initialize
+ * lookup_num_blocks only.
*/
-
- /* Initialize the fast lookup table: */
lookup_num_blocks = orc_lookup_end - orc_lookup;
- for (i = 0; i < lookup_num_blocks-1; i++) {
- orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
- num_entries,
- LOOKUP_START_IP + (LOOKUP_BLOCK_SIZE * i));
- if (!orc) {
- orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
- return;
- }
-
- orc_lookup[i] = orc - __start_orc_unwind;
- }
-
- /* Initialize the ending block: */
- orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind, num_entries,
- LOOKUP_STOP_IP);
- if (!orc) {
- orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
- return;
- }
- orc_lookup[lookup_num_blocks-1] = orc - __start_orc_unwind;
-
orc_init = true;
}
--
2.14.4.44.g2045bb6
* Re: [PATCH v2 0/3] Build ORC fast lookup table in scripts/sorttable tool
2020-08-07 4:17 ` [PATCH v2 0/3] Build ORC fast lookup table in scripts/sorttable tool Huaixin Chang
` (2 preceding siblings ...)
2020-08-07 4:18 ` [PATCH 3/3] x86/unwind/orc: Simplify unwind_init() for x86 boot Huaixin Chang
@ 2020-08-19 3:03 ` changhuaixin
3 siblings, 0 replies; 9+ messages in thread
From: changhuaixin @ 2020-08-19 3:03 UTC
To: Ingo Molnar
Cc: bp, hpa, jpoimboe, linux-kbuild, linux-kernel, luto, michal.lkml,
mingo, peterz, tglx, x86, yamada.masahiro
Hi Ingo,
This patchset reverts the hacks from v1 and includes the other fixes suggested in the v1 review.
Could you please have a look at it?
The previous discussion is at:
https://lore.kernel.org/lkml/20200724135531.GB648324@gmail.com/
Thanks,
huaixin
> On Aug 7, 2020, at 12:17 PM, Huaixin Chang <changhuaixin@linux.alibaba.com> wrote:
>
> Move building of the fast lookup table from boot to the sorttable tool. This saves
> 6380us of boot time on an Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz with 64 cores. It
> adds a little more than 7ms to build time when testing on the same CPU.
>
> Changelog v2:
> 1. Write .orc_lookup section header via objtool
> 2. Move two ORC lookup table macros from orc_lookup.h into orc_types.h
> 3. Spell 'ORC' in capitalized fashion
>
> Huaixin Chang (3):
> objtool: Write .orc_lookup section header
> scripts/sorttable: Build ORC fast lookup table via sorttable tool
> x86/unwind/orc: Simplify unwind_init() for x86 boot
>
> arch/x86/include/asm/orc_lookup.h | 16 ------
> arch/x86/include/asm/orc_types.h | 16 ++++++
> arch/x86/kernel/unwind_orc.c | 41 +--------------
> arch/x86/kernel/vmlinux.lds.S | 2 +-
> scripts/sorttable.h | 96 +++++++++++++++++++++++++++++++---
> tools/arch/x86/include/asm/orc_types.h | 16 ++++++
> tools/objtool/orc_gen.c | 4 ++
> 7 files changed, 128 insertions(+), 63 deletions(-)
>
> --
> 2.14.4.44.g2045bb6