linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules
@ 2022-02-08 18:43 Nick Alcock
  2022-02-08 18:43 ` [PATCH v8 1/6] kbuild: bring back tristate.conf Nick Alcock
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Nick Alcock @ 2022-02-08 18:43 UTC (permalink / raw)
  To: mcgrof, masahiroy
  Cc: jolsa, rostedt, bas, tglozar, Ast-x64, viktor.malik, dxu, acme,
	adrian.hunter, ak, irogers, linux-kbuild, linux-modules,
	linux-kernel, arnd, akpm, eugene.loh, kris.van.hees

The kallmodsyms patch series was originally posted in Nov 2019, and the thread
(https://lore.kernel.org/linux-kbuild/20191114223036.9359-1-eugene.loh@oracle.com/t/#u)
shows review comments, questions, and feedback from interested parties.

All review comments have been satisfied, as far as I know: in particular
Yamada's note about translation units that are shared between built-in modules
is satisfied with a better representation which is also much, much smaller.

A kernel tree containing this series alone, atop -rc3:
   https://github.com/oracle/dtrace-linux-kernel kallmodsyms/5.17-rc3

Trees for trying this out, if you want to try this series in conjunction
with its major current user:

 userspace tree for the dtrace tool itself:
   https://github.com/oracle/dtrace-utils.git, dev branch
 kernel tree comprising this series and a few other patches needed by
 dtrace:
   https://github.com/oracle/dtrace-linux-kernel, v2/5.17-rc2 branch

(See the README.md in the latter for dtrace build instructions.  Note the need for a
reasonably recent binutils, a trunk GCC, and a cross-bpf toolchain.)


/proc/kallsyms is very useful for tracers and other tools that need to
map kernel symbols to addresses.

It would be useful if there were a mapping between kernel symbol and module
name that only changed when the kernel source code is changed.  This mapping
should not change simply because a module becomes built into the kernel, so
that it's not broken by changes in user configuration.  (DTrace for Linux
already uses the approach in this patch for this purpose.)

In brief we do this by mapping from address ranges to object files (with
assistance from the linker map file), then mapping from object files to
potential kernel modules. Because the number of object files is much smaller
than the number of symbols, this is a fairly efficient representation, even with
a bit of extra complexity to allow object files to be in more than one module at
once.

The size impact of all of this is minimal: in one of my tests, vmlinux grew by
0.17% (10824 bytes), and the compressed vmlinux only grew by 0.08% (7552 bytes):
though this is very configuration-dependent, it seems likely to scale roughly
with the kernel as a whole.

This is all controlled by a new config parameter CONFIG_KALLMODSYMS, which when
set results in output in /proc/kallmodsyms that looks like this:

ffffffff8b013d20 409 t pt_buffer_setup_aux
ffffffff8b014130 11f T intel_pt_interrupt
ffffffff8b014250 2d T cpu_emergency_stop_pt
ffffffff8b014280 13a t rapl_pmu_event_init      [intel_rapl_perf]
ffffffff8b0143c0 bb t rapl_event_update [intel_rapl_perf]
ffffffff8b014480 10 t rapl_pmu_event_read       [intel_rapl_perf]
ffffffff8b014490 a3 t rapl_cpu_offline  [intel_rapl_perf]
ffffffff8b014540 24 t __rapl_event_show [intel_rapl_perf]
ffffffff8b014570 f2 t rapl_pmu_event_stop       [intel_rapl_perf]

These symbols are notated as [intel_rapl_perf] even if intel_rapl_perf is built
into the kernel.

Further down, we see what happens when object files are reused by
multiple modules, all of which are built in to the kernel:

ffffffffa22b3aa0 ab t handle_timestamp  [liquidio]
ffffffffa22b3b50 4a t free_netbuf       [liquidio]
ffffffffa22b3ba0 8d t liquidio_ptp_settime      [liquidio]
ffffffffa22b3c30 b3 t liquidio_ptp_adjfreq      [liquidio]
[...]
ffffffffa22b9490 203 t lio_vf_rep_create        [liquidio]
ffffffffa22b96a0 16b t lio_vf_rep_destroy       [liquidio]
ffffffffa22b9810 1f t lio_vf_rep_modinit        [liquidio]
ffffffffa22b9830 1f t lio_vf_rep_modexit        [liquidio]
ffffffffa22b9850 d2 t lio_ethtool_get_channels   [liquidio] [liquidio_vf]
ffffffffa22b9930 9c t lio_ethtool_get_ringparam  [liquidio] [liquidio_vf]
ffffffffa22b99d0 11 t lio_get_msglevel   [liquidio] [liquidio_vf]
ffffffffa22b99f0 11 t lio_vf_set_msglevel        [liquidio] [liquidio_vf]
ffffffffa22b9a10 2b t lio_get_pauseparam         [liquidio] [liquidio_vf]
ffffffffa22b9a40 738 t lio_get_ethtool_stats     [liquidio] [liquidio_vf]
ffffffffa22ba180 368 t lio_vf_get_ethtool_stats  [liquidio] [liquidio_vf]
ffffffffa22ba4f0 37 t lio_get_regs_len   [liquidio] [liquidio_vf]
ffffffffa22ba530 18 t lio_get_priv_flags         [liquidio] [liquidio_vf]
ffffffffa22ba550 2e t lio_set_priv_flags         [liquidio] [liquidio_vf]
ffffffffa22ba580 69 t lio_set_fecparam   [liquidio] [liquidio_vf]
ffffffffa22ba5f0 92 t lio_get_fecparam   [liquidio] [liquidio_vf]
[...]
ffffffffa22cbd10 175 t liquidio_set_mac [liquidio_vf]
ffffffffa22cbe90 ab t handle_timestamp  [liquidio_vf]
ffffffffa22cbf40 4a t free_netbuf       [liquidio_vf]
ffffffffa22cbf90 2b t octnet_link_status_change [liquidio_vf]
ffffffffa22cbfc0 7e t liquidio_vxlan_port_command.constprop.0   [liquidio_vf]

Like /proc/kallsyms, the output is driven by address, so keeps the
curious property of /proc/kallsyms that symbols (like free_netbuf above)
may appear repeatedly with different addresses: but now, unlike in
/proc/kallsyms, we can see that those symbols appear repeatedly because
they are *different symbols* that ultimately belong to different
modules, all of which are built in to the kernel.

Those symbols that come from object files that are genuinely reused and that
appear only once in memory get a /proc/kallmodsyms line with [multiple]
[modules] on it: consumers will have to be ready to handle such lines.

Also, kernel symbols for built-in modules will probably appear interspersed with
other symbols that are part of different modules and non-modular always-built-in
symbols, which, as usual, have no square-bracketed module denotation.

As with /proc/kallsyms, non-root usage produces addresses that are all zero.

I am open to changing the name and/or format of /proc/kallmodsyms, but felt it
best to split it out of /proc/kallsyms to avoid breaking existing kallsyms
parsers.  Another possible syntax might be to use {curly brackets} or something
to denote built-in modules: it might be possible to drop /proc/kallmodsyms and
make /proc/kallsyms emit things in this format.  (Equally, now kallmodsyms data
uses very little space, the CONFIG_KALLMODSYMS config option might be something
people don't want to bother with.)



The commits in this series all have reviewed-by tags: they're all from
internal reviews, so please ignore them.


Differences from v7, December:

 - Adjust for changes in the v5.17 merge window.  Adjust a few commit
   messages and shrink the cover letter.
 - Drop the symbol-size patch, probably better done from userspace.

Differences from v6, November:

 - Adjust for rewrite of confdata machinery in v5.16 (tristate.conf
   handling is now more of a rewrite than a reversion)

Differences from v5, October:

 - Fix generation of mapfiles under UML

Differences from v4, September:

 - Fix building of tristate.conf if missing (usually concealed by the
   syncconfig being run for other reasons, but not always: the kernel
   test robot spotted it).
 - Forward-port atop v5.15-rc3.

Differences from v3, August:

 - Fix a kernel test robot warning in get_ksymbol_core (possible
   use of uninitialized variable if kallmodsyms was wanted but
   kallsyms_module_offsets was not present, which is most unlikely).

Differences from v2, June:

 - Split the series up.  In particular, the size impact of the table
   optimizer is now quantified, and the symbol-size patch is split out and
   turned into an RFC patch, with the /proc/kallmodsyms format before that
   patch lacking a size column.  Some speculation on how to make the symbol
   sizes less space-wasteful is added (but not yet implemented).

 - Drop a couple of unnecessary #includes, one unnecessarily exported
   symbol, and a needless de-staticing.

Differences from v1, a year or so back:

 - Move from a straight symbol->module name mapping to a mapping from
   address-range to TU to module name list, bringing major space savings
   over the previous approach and support for object files used by many
   built-in modules at the same time, at the cost of a slightly more complex
   approach (unavoidably so, I think, given that we have to merge three data
   sources together: the link map in .tmp_vmlinux.ranges, the nm output on
   stdin, and the mapping from TU name to module names in
   modules_thick.builtin).

   We do opportunistic merging of TUs if they cite the same modules and
   reuse module names where doing so is simple: see optimize_obj2mod
   below. I considered more extensive searches for mergeable entries and
   more intricate encodings of the module name list allowing TUs that are
   used by overlapping sets of modules to share their names, but such
   modules are rare enough (and such overlapping sharings are vanishingly
   rare) that it seemed likely to save only a few bytes at the cost of much
   more hard-to-test code. This is doubly true now that the tables needed
   are only a few kilobytes in length.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Signed-off-by: Eugene Loh <eugene.loh@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>

Nick Alcock (6):
  kbuild: bring back tristate.conf
  kbuild: add modules_thick.builtin
  kbuild: generate an address ranges map at vmlinux link time
  kallsyms: introduce sections needed to map symbols to built-in modules
  kallsyms: optimize .kallsyms_modules*
  kallsyms: add /proc/kallmodsyms

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v8 1/6] kbuild: bring back tristate.conf
  2022-02-08 18:43 [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Nick Alcock
@ 2022-02-08 18:43 ` Nick Alcock
  2022-02-08 18:43 ` [PATCH v8 2/6] kbuild: add modules_thick.builtin Nick Alcock
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Nick Alcock @ 2022-02-08 18:43 UTC (permalink / raw)
  To: mcgrof, masahiroy
  Cc: jolsa, rostedt, bas, tglozar, Ast-x64, viktor.malik, dxu, acme,
	adrian.hunter, ak, irogers, linux-kbuild, linux-modules,
	linux-kernel, arnd, akpm, eugene.loh, kris.van.hees,
	Victor Erminpour

tristate.conf was dropped because it is not needed to build a
modules.builtin (although dropping it introduces a few false positives
into modules.builtin support), and doing so avoids one round of
recursion through the build tree to build it.  But kallmodsyms support
requires building a mapping from object file name to built-in module
name for all builtin modules: this seems to me impossible to accomplish
without parsing all makefiles under the influence of tristate.conf,
since the makefiles are the only place this mapping is recorded.

So bring it back for this purpose.  (Thanks to the refactoring in
the 5.16 timeframe, this is basically a reimplementation of commit
8b41fc4454e36fbfdbb23f940d023d4dece2de29 rather than a simple
reversion.)

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Victor Erminpour <victor.erminpour@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
---

Notes:
    v7: rewrite in terms of the new confdata refactoring
    v8: adjust for changes in 5.17 merge window

 Documentation/kbuild/kconfig.rst |  5 ++++
 Makefile                         |  2 +-
 scripts/kconfig/confdata.c       | 42 +++++++++++++++++++++++++++-----
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/Documentation/kbuild/kconfig.rst b/Documentation/kbuild/kconfig.rst
index 5967c79c3baa..e2c78760d442 100644
--- a/Documentation/kbuild/kconfig.rst
+++ b/Documentation/kbuild/kconfig.rst
@@ -162,6 +162,11 @@ KCONFIG_AUTOCONFIG
 This environment variable can be set to specify the path & name of the
 "auto.conf" file.  Its default value is "include/config/auto.conf".
 
+KCONFIG_TRISTATE
+----------------
+This environment variable can be set to specify the path & name of the
+"tristate.conf" file.  Its default value is "include/config/tristate.conf".
+
 KCONFIG_AUTOHEADER
 ------------------
 This environment variable can be set to specify the path & name of the
diff --git a/Makefile b/Makefile
index ceb987e5c87b..199b6f388484 100644
--- a/Makefile
+++ b/Makefile
@@ -716,7 +716,7 @@ $(KCONFIG_CONFIG):
 #
 # Do not use $(call cmd,...) here. That would suppress prompts from syncconfig,
 # so you cannot notice that Kconfig is waiting for the user input.
-%/config/auto.conf %/config/auto.conf.cmd %/generated/autoconf.h: $(KCONFIG_CONFIG)
+%/config/auto.conf %/config/auto.conf.cmd %/generated/autoconf.h %/tristate.conf: $(KCONFIG_CONFIG)
 	$(Q)$(kecho) "  SYNC    $@"
 	$(Q)$(MAKE) -f $(srctree)/Makefile syncconfig
 else # !may-sync-config
diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
index 59717be31210..21dd9112311c 100644
--- a/scripts/kconfig/confdata.c
+++ b/scripts/kconfig/confdata.c
@@ -216,6 +216,13 @@ static const char *conf_get_autoheader_name(void)
 	return name ? name : "include/generated/autoconf.h";
 }
 
+static const char *conf_get_tristate_name(void)
+{
+	char *name = getenv("KCONFIG_TRISTATE");
+
+	return name ? name : "include/config/tristate.conf";
+}
+
 static int conf_set_sym_val(struct symbol *sym, int def, int def_flags, char *p)
 {
 	char *p2;
@@ -667,8 +674,12 @@ static char *escape_string_value(const char *in)
  */
 enum output_n { OUTPUT_N, OUTPUT_N_AS_UNSET, OUTPUT_N_NONE };
 
+#define PRINT_ESCAPE		0x01
+#define PRINT_UPCASE		0x02
+#define PRINT_TRISTATE_ONLY	0x04
+
 static void __print_symbol(FILE *fp, struct symbol *sym, enum output_n output_n,
-			   bool escape_string)
+			   int flags)
 {
 	const char *val;
 	char *escaped = NULL;
@@ -676,6 +687,9 @@ static void __print_symbol(FILE *fp, struct symbol *sym, enum output_n output_n,
 	if (sym->type == S_UNKNOWN)
 		return;
 
+	if (flags & PRINT_TRISTATE_ONLY && sym->type != S_TRISTATE)
+		return;
+
 	val = sym_get_string_value(sym);
 
 	if ((sym->type == S_BOOLEAN || sym->type == S_TRISTATE) &&
@@ -685,29 +699,38 @@ static void __print_symbol(FILE *fp, struct symbol *sym, enum output_n output_n,
 		return;
 	}
 
-	if (sym->type == S_STRING && escape_string) {
+	if (sym->type == S_STRING && flags & PRINT_ESCAPE) {
 		escaped = escape_string_value(val);
 		val = escaped;
 	}
 
-	fprintf(fp, "%s%s=%s\n", CONFIG_, sym->name, val);
+	if (flags & PRINT_UPCASE)
+		fprintf(fp, "%s%s=%c\n", CONFIG_, sym->name, (char)toupper(*val));
+	else
+		fprintf(fp, "%s%s=%s\n", CONFIG_, sym->name, val);
 
 	free(escaped);
 }
 
 static void print_symbol_for_dotconfig(FILE *fp, struct symbol *sym)
 {
-	__print_symbol(fp, sym, OUTPUT_N_AS_UNSET, true);
+	__print_symbol(fp, sym, OUTPUT_N_AS_UNSET, PRINT_ESCAPE);
 }
 
 static void print_symbol_for_autoconf(FILE *fp, struct symbol *sym)
 {
-	__print_symbol(fp, sym, OUTPUT_N_NONE, false);
+	__print_symbol(fp, sym, OUTPUT_N_NONE, 0);
+}
+
+static void print_symbol_for_tristate(FILE *fp, struct symbol *sym)
+{
+	__print_symbol(fp, sym, OUTPUT_N_NONE, PRINT_ESCAPE | PRINT_UPCASE |
+		       PRINT_TRISTATE_ONLY);
 }
 
 void print_symbol_for_listconfig(struct symbol *sym)
 {
-	__print_symbol(stdout, sym, OUTPUT_N, true);
+	__print_symbol(stdout, sym, OUTPUT_N, PRINT_ESCAPE);
 }
 
 static void print_symbol_for_c(FILE *fp, struct symbol *sym)
@@ -1131,6 +1154,13 @@ int conf_write_autoconf(int overwrite)
 	if (ret)
 		return ret;
 
+
+	ret = __conf_write_autoconf(conf_get_tristate_name(),
+				    print_symbol_for_tristate,
+				    &comment_style_pound);
+	if (ret)
+		return ret;
+
 	/*
 	 * Create include/config/auto.conf. This must be the last step because
 	 * Kbuild has a dependency on auto.conf and this marks the successful
-- 
2.35.0.260.gb82b153193.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v8 2/6] kbuild: add modules_thick.builtin
  2022-02-08 18:43 [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Nick Alcock
  2022-02-08 18:43 ` [PATCH v8 1/6] kbuild: bring back tristate.conf Nick Alcock
@ 2022-02-08 18:43 ` Nick Alcock
  2022-02-10  0:35   ` Masahiro Yamada
  2022-02-08 18:43 ` [PATCH v8 3/6] kbuild: generate an address ranges map at vmlinux link time Nick Alcock
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Nick Alcock @ 2022-02-08 18:43 UTC (permalink / raw)
  To: mcgrof, masahiroy
  Cc: jolsa, rostedt, bas, tglozar, Ast-x64, viktor.malik, dxu, acme,
	adrian.hunter, ak, irogers, linux-kbuild, linux-modules,
	linux-kernel, arnd, akpm, eugene.loh, kris.van.hees

This is similar to modules.builtin, and constructed in a similar way to
the way that used to be built before commit
8b41fc4454e36fbfdbb23f940d023d4dece2de29, via tristate.conf inclusion
and recursive concatenation up the tree. Unlike modules.builtin,
modules_thick.builtin givs the names of the object files that make up
modules that are comprised of more than one object file, using a syntax
similar to that of makefiles, e.g.:

crypto/crypto.o: crypto/api.o crypto/cipher.o crypto/compress.o crypto/memneq.o
crypto/crypto_algapi.o: crypto/algapi.o crypto/proc.o crypto/scatterwalk.o
crypto/aead.o:
crypto/geniv.o:

(where the latter two are single-file modules).

An upcoming commit will use this mapping to populate /proc/kallmodsyms.

A parser is included that yields a stram of (module, objfile name[])
mappings: it's a bit baroque, but then parsing text files in C is quite
painful, and I'd rather put the complexity in here than in its callers.
The parser is not built in this commit, nor does it have any callers
yet; nor is any rule added that causes modules_thick.builtin to actually
be constructed.  (Again, see a later commit for that.)

I am not wedded to the approach used to construct this file, but I don't
see any other way to do it despite spending a week or so trying to tie
it into Kbuild without using a separate Makefile.modbuiltin: unlike the
names of builtin modules (which are also recorded in the source files
themseves via MODULE_*() macros) the mapping from object file name to
built-in module name is not recorded anywhere but in the makefiles
themselves, so we have to at least reparse them with something to
indicate the builtin-ness of each module (i.e., tristate.conf) if we are
to figure out which modules are built-in and which are not.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
---
 .gitignore                  |   1 +
 Documentation/dontdiff      |   1 +
 Makefile                    |  19 +++-
 scripts/Kbuild.include      |   6 ++
 scripts/Makefile.modbuiltin |  56 ++++++++++
 scripts/modules_thick.c     | 200 ++++++++++++++++++++++++++++++++++++
 scripts/modules_thick.h     |  48 +++++++++
 7 files changed, 330 insertions(+), 1 deletion(-)
 create mode 100644 scripts/Makefile.modbuiltin
 create mode 100644 scripts/modules_thick.c
 create mode 100644 scripts/modules_thick.h

diff --git a/.gitignore b/.gitignore
index 7afd412dadd2..b49cd96f587a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -49,6 +49,7 @@
 *.zst
 Module.symvers
 modules.order
+modules_thick.builtin
 
 #
 # Top-level generic files
diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 910b30a2a7d9..6a78988547d0 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -183,6 +183,7 @@ modules.builtin
 modules.builtin.modinfo
 modules.nsdeps
 modules.order
+modules_thick.builtin
 modversions.h*
 nconf
 nconf-cfg
diff --git a/Makefile b/Makefile
index 199b6f388484..5e823fe8390f 100644
--- a/Makefile
+++ b/Makefile
@@ -1470,6 +1470,23 @@ __modinst_pre:
 
 endif # CONFIG_MODULES
 
+# modules_thick.builtin maps from kernel modules (or rather the object file
+# names they would have had had they not been built in) to their constituent
+# object files: we can use this to determine which modules any given object
+# file is part of.  (We cannot eliminate the slight redundancy here without
+# double-expansion.)
+
+modthickbuiltin-files := $(addsuffix /modules_thick.builtin, $(build-dirs))
+
+modules_thick.builtin: $(modthickbuiltin-files)
+	$(Q)$(AWK) '!x[$$0]++' $(addsuffix /$@, $(build-dirs)) > $@
+
+# tristate.conf is not included from this Makefile. Add it as a prerequisite
+# here to make it self-healing in case somebody accidentally removes it.
+$(modthickbuiltin-files): include/config/tristate.conf
+	$(Q)$(MAKE) $(modbuiltin)=$(patsubst %/modules_thick.builtin,%,$@) builtin-file=modules_thick.builtin
+
+
 ###
 # Cleaning is done on three levels.
 # make clean     Delete most generated files
@@ -1849,7 +1866,7 @@ clean: $(clean-dirs)
 		-o -name '*.lex.c' -o -name '*.tab.[ch]' \
 		-o -name '*.asn1.[ch]' \
 		-o -name '*.symtypes' -o -name 'modules.order' \
-		-o -name '.tmp_*.o.*' \
+		-o -name '.tmp_*.o.*' -o -name modules_thick.builtin \
 		-o -name '*.c.[012]*.*' \
 		-o -name '*.ll' \
 		-o -name '*.gcno' \
diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 3514c2149e9d..167bbbd5fdf5 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -74,6 +74,12 @@ endef
 # $(Q)$(MAKE) $(build)=dir
 build := -f $(srctree)/scripts/Makefile.build obj
 
+###
+# Shorthand for $(Q)$(MAKE) -f scripts/Makefile.modbuiltin obj=
+# Usage:
+# $(Q)$(MAKE) $(modbuiltin)=dir
+modbuiltin := -f $(srctree)/scripts/Makefile.modbuiltin obj
+
 ###
 # Shorthand for $(Q)$(MAKE) -f scripts/Makefile.dtbinst obj=
 # Usage:
diff --git a/scripts/Makefile.modbuiltin b/scripts/Makefile.modbuiltin
new file mode 100644
index 000000000000..a27b692ea795
--- /dev/null
+++ b/scripts/Makefile.modbuiltin
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: GPL-2.0
+# ==========================================================================
+# Generating modules_thick.builtin
+# ==========================================================================
+
+src := $(obj)
+
+PHONY := __modbuiltin
+__modbuiltin:
+
+include include/config/auto.conf
+# tristate.conf sets tristate variables to uppercase 'Y' or 'M'
+# That way, we get the list of built-in modules in obj-Y
+include include/config/tristate.conf
+
+include scripts/Kbuild.include
+
+ifdef building_out_of_srctree
+# Create output directory if not already present
+_dummy := $(shell [ -d $(obj) ] || mkdir -p $(obj))
+endif
+
+# The filename Kbuild has precedence over Makefile
+kbuild-dir := $(if $(filter /%,$(src)),$(src),$(srctree)/$(src))
+kbuild-file := $(if $(wildcard $(kbuild-dir)/Kbuild),$(kbuild-dir)/Kbuild,$(kbuild-dir)/Makefile)
+include $(kbuild-file)
+
+include scripts/Makefile.lib
+
+modthickbuiltin-subdirs := $(patsubst %,%/modules_thick.builtin, $(subdir-ym))
+modthickbuiltin-target  := $(obj)/modules_thick.builtin
+
+__modbuiltin: $(obj)/$(builtin-file) $(subdir-ym)
+	@:
+
+$(modthickbuiltin-target): $(subdir-ym) FORCE
+	$(Q) rm -f $@
+	$(Q) $(foreach mod-o, $(filter %.o,$(obj-Y)),\
+		printf "%s:" $(addprefix $(obj)/,$(mod-o)) >> $@; \
+		printf " %s" $(sort $(strip $(addprefix $(obj)/,$($(mod-o:.o=-objs)) \
+			$($(mod-o:.o=-y)) $($(mod-o:.o=-Y))))) >> $@; \
+		printf "\n" >> $@; ) \
+	cat /dev/null $(modthickbuiltin-subdirs) >> $@;
+
+PHONY += FORCE
+
+FORCE:
+
+# Descending
+# ---------------------------------------------------------------------------
+
+PHONY += $(subdir-ym)
+$(subdir-ym):
+	$(Q)$(MAKE) $(modbuiltin)=$@ builtin-file=$(builtin-file)
+
+.PHONY: $(PHONY)
diff --git a/scripts/modules_thick.c b/scripts/modules_thick.c
new file mode 100644
index 000000000000..9a15e99c1330
--- /dev/null
+++ b/scripts/modules_thick.c
@@ -0,0 +1,200 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * A simple modules_thick reader.
+ *
+ * (C) 2014, 2021 Oracle, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "modules_thick.h"
+
+/*
+ * Read a modules_thick.builtin file and translate it into a stream of
+ * name / module-name pairs.
+ */
+
+/*
+ * Construct a modules_thick.builtin iterator.
+ */
+struct modules_thick_iter *
+modules_thick_iter_new(const char *modules_thick_file)
+{
+	struct modules_thick_iter *i;
+
+	i = calloc(1, sizeof(struct modules_thick_iter));
+	if (i == NULL)
+		return NULL;
+
+	i->f = fopen(modules_thick_file, "r");
+
+	if (i->f == NULL) {
+		fprintf(stderr, "Cannot open builtin module file %s: %s\n",
+			modules_thick_file, strerror(errno));
+		return NULL;
+	}
+
+	return i;
+}
+
+/*
+ * Iterate, returning a new null-terminated array of object file names, and a
+ * new dynamically-allocated module name.  (The module name passed in is freed.)
+ *
+ * The array of object file names should be freed by the caller: the strings it
+ * points to are owned by the iterator, and should not be freed.
+ */
+
+char ** __attribute__((__nonnull__))
+modules_thick_iter_next(struct modules_thick_iter *i, char **module_name)
+{
+	size_t npaths = 1;
+	char **module_paths;
+	char *last_slash;
+	char *last_dot;
+	char *trailing_linefeed;
+	char *object_name = i->line;
+	char *dash;
+	int composite = 0;
+
+	/*
+	 * Read in all module entries, computing the suffixless, pathless name
+	 * of the module and building the next arrayful of object file names for
+	 * return.
+	 *
+	 * Modules can consist of multiple files: in this case, the portion
+	 * before the colon is the path to the module (as before): the portion
+	 * after the colon is a space-separated list of files that should be
+	 * considered part of this module.  In this case, the portion before the
+	 * name is an "object file" that does not actually exist: it is merged
+	 * into built-in.a without ever being written out.
+	 *
+	 * All module names have - translated to _, to match what is done to the
+	 * names of the same things when built as modules.
+	 */
+
+	/*
+	 * Reinvocation of exhausted iterator. Return NULL, once.
+	 */
+retry:
+	if (getline(&i->line, &i->line_size, i->f) < 0) {
+		if (ferror(i->f)) {
+			fprintf(stderr, "Error reading from modules_thick file:"
+				" %s\n", strerror(errno));
+			exit(1);
+		}
+		rewind(i->f);
+		return NULL;
+	}
+
+	if (i->line[0] == '\0')
+		goto retry;
+
+	/*
+	 * Slice the line in two at the colon, if any.  If there is anything
+	 * past the ': ', this is a composite module.  (We allow for no colon
+	 * for robustness, even though one should always be present.)
+	 */
+	if (strchr(i->line, ':') != NULL) {
+		char *name_start;
+
+		object_name = strchr(i->line, ':');
+		*object_name = '\0';
+		object_name++;
+		name_start = object_name + strspn(object_name, " \n");
+		if (*name_start != '\0') {
+			composite = 1;
+			object_name = name_start;
+		}
+	}
+
+	/*
+	 * Figure out the module name.
+	 */
+	last_slash = strrchr(i->line, '/');
+	last_slash = (!last_slash) ? i->line :
+		last_slash + 1;
+	free(*module_name);
+	*module_name = strdup(last_slash);
+	dash = *module_name;
+
+	while (dash != NULL) {
+		dash = strchr(dash, '-');
+		if (dash != NULL)
+			*dash = '_';
+	}
+
+	last_dot = strrchr(*module_name, '.');
+	if (last_dot != NULL)
+		*last_dot = '\0';
+
+	trailing_linefeed = strchr(object_name, '\n');
+	if (trailing_linefeed != NULL)
+		*trailing_linefeed = '\0';
+
+	/*
+	 * Multifile separator? Object file names explicitly stated:
+	 * slice them up and shuffle them in.
+	 *
+	 * The array size may be an overestimate if any object file
+	 * names start or end with spaces (very unlikely) but cannot be
+	 * an underestimate.  (Check for it anyway.)
+	 */
+	if (composite) {
+		char *one_object;
+
+		for (npaths = 0, one_object = object_name;
+		     one_object != NULL;
+		     npaths++, one_object = strchr(one_object + 1, ' '));
+	}
+
+	module_paths = malloc((npaths + 1) * sizeof(char *));
+	if (!module_paths) {
+		fprintf(stderr, "%s: out of memory on module %s\n", __func__,
+			*module_name);
+		exit(1);
+	}
+
+	if (composite) {
+		char *one_object;
+		size_t i = 0;
+
+		while ((one_object = strsep(&object_name, " ")) != NULL) {
+			if (i >= npaths) {
+				fprintf(stderr, "%s: num_objs overflow on module "
+					"%s: this is a bug.\n", __func__,
+					*module_name);
+				exit(1);
+			}
+
+			module_paths[i++] = one_object;
+		}
+	} else
+		module_paths[0] = i->line;	/* untransformed module name */
+
+	module_paths[npaths] = NULL;
+
+	return module_paths;
+}
+
+/*
+ * Free an iterator. Can be called while iteration is underway, so even
+ * state that is freed at the end of iteration must be freed here too.
+ */
+void
+modules_thick_iter_free(struct modules_thick_iter *i)
+{
+	if (i == NULL)
+		return;
+	fclose(i->f);
+	free(i->line);
+	free(i);
+}
diff --git a/scripts/modules_thick.h b/scripts/modules_thick.h
new file mode 100644
index 000000000000..f5edcaf9550c
--- /dev/null
+++ b/scripts/modules_thick.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * A simple modules_thick reader.
+ *
+ * (C) 2014, 2021 Oracle, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _LINUX_MODULES_THICK_H
+#define _LINUX_MODULES_THICK_H
+
+#include <stdio.h>
+#include <stddef.h>
+
+/*
+ * modules_thick.builtin iteration state.
+ */
+struct modules_thick_iter {
+	FILE *f;
+	char *line;
+	size_t line_size;
+};
+
+/*
+ * Construct a modules_thick.builtin iterator.
+ */
+struct modules_thick_iter *
+modules_thick_iter_new(const char *modules_thick_file);
+
+/*
+ * Iterate, returning a new null-terminated array of object file names, and a
+ * new dynamically-allocated module name.  (The module name passed in is freed.)
+ *
+ * The array of object file names should be freed by the caller: the strings it
+ * points to are owned by the iterator, and should not be freed.
+ */
+
+char ** __attribute__((__nonnull__))
+modules_thick_iter_next(struct modules_thick_iter *i, char **module_name);
+
+void
+modules_thick_iter_free(struct modules_thick_iter *i);
+
+#endif
-- 
2.35.0.260.gb82b153193.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v8 3/6] kbuild: generate an address ranges map at vmlinux link time
  2022-02-08 18:43 [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Nick Alcock
  2022-02-08 18:43 ` [PATCH v8 1/6] kbuild: bring back tristate.conf Nick Alcock
  2022-02-08 18:43 ` [PATCH v8 2/6] kbuild: add modules_thick.builtin Nick Alcock
@ 2022-02-08 18:43 ` Nick Alcock
  2022-02-08 18:43 ` [PATCH v8 4/6] kallsyms: introduce sections needed to map symbols to built-in modules Nick Alcock
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Nick Alcock @ 2022-02-08 18:43 UTC (permalink / raw)
  To: mcgrof, masahiroy
  Cc: jolsa, rostedt, bas, tglozar, Ast-x64, viktor.malik, dxu, acme,
	adrian.hunter, ak, irogers, linux-kbuild, linux-modules,
	linux-kernel, arnd, akpm, eugene.loh, kris.van.hees

This emits a new file, .tmp_vmlinux.ranges, which maps address
range/size pairs in vmlinux to the object files which make them up,
e.g., in part:

0x0000000000000000 0x30 arch/x86/kernel/cpu/common.o
0x0000000000001000 0x1000 arch/x86/events/intel/ds.o
0x0000000000002000 0x4000 arch/x86/kernel/irq_64.o
0x0000000000006000 0x5000 arch/x86/kernel/process.o
0x000000000000b000 0x1000 arch/x86/kernel/cpu/common.o
0x000000000000c000 0x5000 arch/x86/mm/cpu_entry_area.o
0x0000000000011000 0x10 arch/x86/kernel/espfix_64.o
0x0000000000011010 0x2 arch/x86/kernel/cpu/common.o
[...]

In my simple tests this seems to work with clang too, but if I'm not
sure how stable the format of clang's linker mapfiles is: if it turns
out not to work in some versions, the mapfile-massaging awk script added
here might need some adjustment.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
---

Notes:
    v6: use ${wl} where appropriate to avoid failure on UML

 scripts/link-vmlinux.sh | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 666f7bbc13eb..981cd441ca21 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -203,7 +203,7 @@ vmlinux_link()
 	${ld} ${ldflags} -o ${output}					\
 		${wl}--whole-archive ${objs} ${wl}--no-whole-archive	\
 		${wl}--start-group ${libs} ${wl}--end-group		\
-		$@ ${ldlibs}
+		${wl}-Map=.tmp_vmlinux.map $@ ${ldlibs}
 }
 
 # generate .BTF typeinfo from DWARF debuginfo
@@ -246,6 +246,19 @@ kallsyms()
 {
 	local kallsymopt;
 
+	# read the linker map to identify ranges of addresses:
+	#   - for each *.o file, report address, size, pathname
+	#       - most such lines will have four fields
+	#       - but sometimes there is a line break after the first field
+	#   - start reading at "Linker script and memory map"
+	#   - stop reading at ".brk"
+	${AWK} '
+	    /\.o$/ && start==1 { print $(NF-2), $(NF-1), $NF }
+	    /^Linker script and memory map/ { start = 1 }
+	    /^\.brk/ { exit(0) }
+	' .tmp_vmlinux.map | sort > .tmp_vmlinux.ranges
+
+	# get kallsyms options
 	if is_enabled CONFIG_KALLSYMS_ALL; then
 		kallsymopt="${kallsymopt} --all-symbols"
 	fi
-- 
2.35.0.260.gb82b153193.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v8 4/6] kallsyms: introduce sections needed to map symbols to built-in modules
  2022-02-08 18:43 [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Nick Alcock
                   ` (2 preceding siblings ...)
  2022-02-08 18:43 ` [PATCH v8 3/6] kbuild: generate an address ranges map at vmlinux link time Nick Alcock
@ 2022-02-08 18:43 ` Nick Alcock
  2022-02-10  0:48   ` Masahiro Yamada
  2022-02-08 18:43 ` [PATCH v8 5/6] kallsyms: optimize .kallsyms_modules* Nick Alcock
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Nick Alcock @ 2022-02-08 18:43 UTC (permalink / raw)
  To: mcgrof, masahiroy
  Cc: jolsa, rostedt, bas, tglozar, Ast-x64, viktor.malik, dxu, acme,
	adrian.hunter, ak, irogers, linux-kbuild, linux-modules,
	linux-kernel, arnd, akpm, eugene.loh, kris.van.hees

The mapping consists of three new symbols, computed by integrating the
information in the (just-added) .tmp_vmlinux.ranges and
modules_thick.builtin: taken together, they map address ranges
(corresponding to object files on the input) to the names of zero or
more modules containing those address ranges.

 - kallsyms_module_addresses/kallsyms_module_offsets encodes the
   address/offset of each object file (derived from the linker map), in
   exactly the same way as kallsyms_addresses/kallsyms_offsets does
   for symbols.  There is no size: instead, the object files are assumed
   to tile the address space.  (This is slightly more space-efficient
   than using a size).  Non-text-section addresses are skipped: for now,
   all the users of this interface only need module/non-module
   information for instruction pointer addresses, not absolute-addressed
   symbols and the like.  This restriction can easily be lifted in
   future.  (Regarding the name: right now the entries correspond pretty
   closely to object files, so we could call the section
   kallsyms_objfiles or something, but the optimizer added in the next
   commit will change this.)

 - kallsyms_module_names encodes the name of each module in a modified
   form of strtab: notably, if an object file appears in *multiple*
   modules, all of which are built in, this is encoded via a zero byte,
   a one-byte module count, then a series of that many null-terminated
   strings.  As a special case, the table starts with a single zero byte
   which does *not* represent the start of a multi-module list.

 - kallsyms_modules connects the two, encoding a table associated 1:1
   with kallsyms_module_addresses / kallsyms_module_offsets, pointing
   at an offset in kallsyms_module_names describing which module (or
   modules, for a multi-module list) the code occupying this address
   range is part of.  If an address range is part of no module (always
   built-in) it points at 0 (the null byte at the start of the
   kallsyms_module_names list).

There is no optimization yet: kallsyms_modules and
kallsyms_module_names will almost certainly contain many duplicate
entries, and kallsyms_module_{addresses,offsets} may contain
consecutive entries that point to the same place.  The size hit is
fairly substantial as a result, though still much less than a naive
implementation mapping each symbol to a module name would be: 50KiB or
so.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
---
 Makefile           |   2 +-
 init/Kconfig       |   8 +
 scripts/Makefile   |   6 +
 scripts/kallsyms.c | 366 +++++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 371 insertions(+), 11 deletions(-)

diff --git a/Makefile b/Makefile
index 5e823fe8390f..b719244cb571 100644
--- a/Makefile
+++ b/Makefile
@@ -1151,7 +1151,7 @@ cmd_link-vmlinux =                                                 \
 	$(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)";    \
 	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
-vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
+vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) modules_thick.builtin FORCE
 	+$(call if_changed_dep,link-vmlinux)
 
 targets := vmlinux
diff --git a/init/Kconfig b/init/Kconfig
index e9119bf54b1f..e1ca3d70cb1c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1530,6 +1530,14 @@ config POSIX_TIMERS
 
 	  If unsure say y.
 
+config KALLMODSYMS
+	default y
+	bool "Enable support for /proc/kallmodsyms" if EXPERT
+	depends on KALLSYMS
+	help
+	  This option enables the /proc/kallmodsyms file, which maps symbols
+	  to addresses and their associated modules.
+
 config PRINTK
 	default y
 	bool "Enable support for printk" if EXPERT
diff --git a/scripts/Makefile b/scripts/Makefile
index ce5aa9030b74..c5cc4ac3d660 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -29,6 +29,12 @@ ifdef CONFIG_BUILDTIME_MCOUNT_SORT
 HOSTCFLAGS_sorttable.o += -DMCOUNT_SORT_ENABLED
 endif
 
+kallsyms-objs  := kallsyms.o
+
+ifdef CONFIG_KALLMODSYMS
+kallsyms-objs += modules_thick.o
+endif
+
 # The following programs are only built on demand
 hostprogs += unifdef
 
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 54ad86d13784..8f87b724d0fa 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -5,7 +5,10 @@
  * This software may be used and distributed according to the terms
  * of the GNU General Public License, incorporated herein by reference.
  *
- * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
+ * Usage: nm -n vmlinux
+ *        | scripts/kallsyms [--all-symbols] [--absolute-percpu]
+ *             [--base-relative] [--builtin=modules_thick.builtin]
+ *        > symbols.S
  *
  *      Table compression uses all the unused char codes on the symbols and
  *  maps these to the most used substrings (tokens). For instance, it might
@@ -24,6 +27,10 @@
 #include <string.h>
 #include <ctype.h>
 #include <limits.h>
+#include <assert.h>
+#include "modules_thick.h"
+
+#include "../include/generated/autoconf.h"
 
 #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))
 
@@ -67,11 +74,116 @@ static int token_profit[0x10000];
 static unsigned char best_table[256][2];
 static unsigned char best_table_len[256];
 
+#ifdef CONFIG_KALLMODSYMS
+static unsigned int strhash(const char *s)
+{
+	/* fnv32 hash */
+	unsigned int hash = 2166136261U;
+
+	for (; *s; s++)
+		hash = (hash ^ *s) * 0x01000193;
+	return hash;
+}
+
+#define OBJ2MOD_BITS 10
+#define OBJ2MOD_N (1 << OBJ2MOD_BITS)
+#define OBJ2MOD_MASK (OBJ2MOD_N - 1)
+struct obj2mod_elem {
+	char *obj;
+	char *mods;			/* sorted module name strtab */
+	size_t nmods;			/* number of modules in "mods" */
+	size_t mods_size;		/* size of all mods together */
+	int mod_offset;			/* offset in .kallsyms_module_names */
+	struct obj2mod_elem *obj2mod_next;
+};
+
+/*
+ * Map from object files to obj2mod entries (a unique mapping).
+ */
+
+static struct obj2mod_elem *obj2mod[OBJ2MOD_N];
+static size_t num_objfiles;
+
+/*
+ * An ordered list of address ranges and the objfile that occupies that range.
+ */
+struct addrmap_entry {
+	unsigned long long addr;
+	struct obj2mod_elem *objfile;
+};
+static struct addrmap_entry *addrmap;
+static int addrmap_num, addrmap_alloced;
+
+static void obj2mod_init(void)
+{
+	memset(obj2mod, 0, sizeof(obj2mod));
+}
+
+static struct obj2mod_elem *obj2mod_get(const char *obj)
+{
+	int i = strhash(obj) & OBJ2MOD_MASK;
+	struct obj2mod_elem *elem;
+
+	for (elem = obj2mod[i]; elem; elem = elem->obj2mod_next) {
+		if (strcmp(elem->obj, obj) == 0)
+			return elem;
+	}
+	return NULL;
+}
+
+/*
+ * Note that a given object file is found in some module, interning it in the
+ * obj2mod hash.  Should not be called more than once for any given (module,
+ * object) pair.
+ */
+static void obj2mod_add(char *obj, char *mod)
+{
+	int i = strhash(obj) & OBJ2MOD_MASK;
+	struct obj2mod_elem *elem;
+
+	elem = obj2mod_get(obj);
+	if (!elem) {
+		elem = malloc(sizeof(struct obj2mod_elem));
+		if (!elem)
+			goto oom;
+		memset(elem, 0, sizeof(struct obj2mod_elem));
+		elem->obj = strdup(obj);
+		if (!elem->obj)
+			goto oom;
+		elem->mods = strdup(mod);
+		if (!elem->mods)
+			goto oom;
+
+		elem->obj2mod_next = obj2mod[i];
+		obj2mod[i] = elem;
+		num_objfiles++;
+	} else {
+		elem->mods = realloc(elem->mods, elem->mods_size +
+				     strlen(mod) + 1);
+		if (!elem->mods)
+			goto oom;
+		strcpy(elem->mods + elem->mods_size, mod);
+	}
+
+	elem->mods_size += strlen(mod) + 1;
+	elem->nmods++;
+	if (elem->nmods > 255) {
+		fprintf(stderr, "kallsyms: %s: too many modules associated with this object file\n",
+			obj);
+		exit(EXIT_FAILURE);
+	}
+	return;
+oom:
+	fprintf(stderr, "kallsyms: out of memory\n");
+	exit(1);
+}
+#endif /* CONFIG_KALLMODSYMS */
 
 static void usage(void)
 {
-	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
-			"[--base-relative] < in.map > out.S\n");
+	fprintf(stderr, "Usage: kallsyms [--all-symbols] [--absolute-percpu] "
+			"[--base-relative] [--builtin=modules_thick.builtin] "
+			"< nm_vmlinux.out > symbols.S\n");
 	exit(1);
 }
 
@@ -95,10 +207,16 @@ static bool is_ignored_symbol(const char *name, char type)
 		"kallsyms_offsets",
 		"kallsyms_relative_base",
 		"kallsyms_num_syms",
+		"kallsyms_num_modules",
 		"kallsyms_names",
 		"kallsyms_markers",
 		"kallsyms_token_table",
 		"kallsyms_token_index",
+		"kallsyms_module_offsets",
+		"kallsyms_module_addresses",
+		"kallsyms_modules",
+		"kallsyms_module_names",
+		"kallsyms_module_names_len",
 		/* Exclude linker generated symbols which vary between passes */
 		"_SDA_BASE_",		/* ppc */
 		"_SDA2_BASE_",		/* ppc */
@@ -246,8 +364,8 @@ static struct sym_entry *read_symbol(FILE *in)
 	return sym;
 }
 
-static int symbol_in_range(const struct sym_entry *s,
-			   const struct addr_range *ranges, int entries)
+static int addr_in_range(unsigned long long addr,
+			 const struct addr_range *ranges, int entries)
 {
 	size_t i;
 	const struct addr_range *ar;
@@ -255,7 +373,7 @@ static int symbol_in_range(const struct sym_entry *s,
 	for (i = 0; i < entries; ++i) {
 		ar = &ranges[i];
 
-		if (s->addr >= ar->start && s->addr <= ar->end)
+		if (addr >= ar->start && addr <= ar->end)
 			return 1;
 	}
 
@@ -269,8 +387,8 @@ static int symbol_valid(const struct sym_entry *s)
 	/* if --all-symbols is not specified, then symbols outside the text
 	 * and inittext sections are discarded */
 	if (!all_symbols) {
-		if (symbol_in_range(s, text_ranges,
-				    ARRAY_SIZE(text_ranges)) == 0)
+		if (addr_in_range(s->addr, text_ranges,
+				  ARRAY_SIZE(text_ranges)) == 0)
 			return 0;
 		/* Corner case.  Discard any symbols with the same value as
 		 * _etext _einittext; they can move between pass 1 and 2 when
@@ -352,6 +470,121 @@ static void output_address(unsigned long long addr)
 		printf("\tPTR\t_text - %#llx\n", _text - addr);
 }
 
+#ifdef CONFIG_KALLMODSYMS
+/* Output the .kallmodsyms_modules symbol content. */
+static void output_kallmodsyms_modules(void)
+{
+	struct obj2mod_elem *elem;
+	size_t offset = 1;
+	size_t i;
+
+	/*
+	 * Traverse and emit, updating mod_offset accordingly.
+	 * Emit a single \0 at the start, to encode non-modular objfiles.
+	 */
+	output_label("kallsyms_module_names");
+	printf("\t.byte\t0\n");
+	for (i = 0; i < OBJ2MOD_N; i++) {
+		for (elem = obj2mod[i]; elem;
+		     elem = elem->obj2mod_next) {
+			const char *onemod;
+			size_t i;
+
+			elem->mod_offset = offset;
+			onemod = elem->mods;
+
+			/*
+			 * Technically this is a waste of space: we could just
+			 * as well implement multimodule entries by pointing one
+			 * byte further back, to the trailing \0 of the previous
+			 * entry, but doing it this way makes it more obvious
+			 * when an entry is a multimodule entry.
+			 */
+			if (elem->nmods != 1) {
+				printf("\t.byte\t0\n");
+				printf("\t.byte\t%zi\n", elem->nmods);
+				offset += 2;
+			}
+
+			for (i = elem->nmods; i > 0; i--) {
+				printf("\t.asciz\t\"%s\"\n", onemod);
+				offset += strlen(onemod) + 1;
+				onemod += strlen(onemod) + 1;
+			}
+		}
+	}
+	printf("\n");
+	output_label("kallsyms_module_names_len");
+	printf("\t.long\t%zi\n", offset);
+}
+
+static void output_kallmodsyms_objfiles(void)
+{
+	size_t i = 0;
+	size_t emitted_offsets = 0;
+	size_t emitted_objfiles = 0;
+
+	if (base_relative)
+		output_label("kallsyms_module_offsets");
+	else
+		output_label("kallsyms_module_addresses");
+
+	for (i = 0; i < addrmap_num; i++) {
+		long long offset;
+		int overflow;
+
+		if (base_relative) {
+			if (!absolute_percpu) {
+				offset = addrmap[i].addr - relative_base;
+				overflow = (offset < 0 || offset > UINT_MAX);
+			} else {
+				offset = relative_base - addrmap[i].addr - 1;
+				overflow = (offset < INT_MIN || offset >= 0);
+			}
+			if (overflow) {
+				fprintf(stderr, "kallsyms failure: "
+					"objfile %s at address %#llx out of range in relative mode\n",
+					addrmap[i].objfile ? addrmap[i].objfile->obj :
+					"in always-built-in object", table[i]->addr);
+				exit(EXIT_FAILURE);
+			}
+			printf("\t.long\t0x%x\n", (int)offset);
+		} else
+			printf("\tPTR\t%#llx\n", addrmap[i].addr);
+		emitted_offsets++;
+	}
+
+	output_label("kallsyms_modules");
+
+	for (i = 0; i < addrmap_num; i++) {
+		struct obj2mod_elem *elem = addrmap[i].objfile;
+		/*
+		 * Address range cites no object file: point at 0, the built-in
+		 * module.
+		 */
+		if (addrmap[i].objfile == NULL) {
+			printf("\t.long\t0x0\n");
+			emitted_objfiles++;
+			continue;
+		}
+
+		/*
+		 * Zero offset is the initial \0, there to catch uninitialized
+		 * obj2mod entries, and is forbidden.
+		 */
+		assert (elem->mod_offset != 0);
+
+		printf("\t.long\t0x%x\n", elem->mod_offset);
+		emitted_objfiles++;
+	}
+
+	assert (emitted_offsets == emitted_objfiles);
+	output_label("kallsyms_num_modules");
+	printf("\t.long\t%zi\n", emitted_objfiles);
+	printf("\n");
+}
+#endif /* CONFIG_KALLMODSYMS */
+
 /* uncompress a compressed symbol. When this function is called, the best table
  * might still be compressed itself, so the function needs to be recursive */
 static int expand_symbol(const unsigned char *data, int len, char *result)
@@ -451,6 +684,11 @@ static void write_src(void)
 		printf("\n");
 	}
 
+#ifdef CONFIG_KALLMODSYMS
+	output_kallmodsyms_modules();
+	output_kallmodsyms_objfiles();
+#endif
+
 	output_label("kallsyms_num_syms");
 	printf("\t.long\t%u\n", table_cnt);
 	printf("\n");
@@ -735,7 +973,7 @@ static void make_percpus_absolute(void)
 	unsigned int i;
 
 	for (i = 0; i < table_cnt; i++)
-		if (symbol_in_range(table[i], &percpu_range, 1)) {
+		if (addr_in_range(table[i]->addr, &percpu_range, 1)) {
 			/*
 			 * Keep the 'A' override for percpu symbols to
 			 * ensure consistent behavior compared to older
@@ -762,17 +1000,124 @@ static void record_relative_base(void)
 		}
 }
 
+#ifdef CONFIG_KALLMODSYMS
+/*
+ * Read the linker map.
+ */
+static void read_linker_map(void)
+{
+	unsigned long long addr, size;
+	char obj[PATH_MAX+1];
+	FILE *f = fopen(".tmp_vmlinux.ranges", "r");
+
+	if (!f) {
+		fprintf(stderr, "Cannot open '.tmp_vmlinux.ranges'.\n");
+		exit(1);
+	}
+
+	addrmap_num = 0;
+	addrmap_alloced = 4096;
+	addrmap = malloc(sizeof(*addrmap) * addrmap_alloced);
+	if (!addrmap)
+		goto oom;
+
+	/*
+	 * For each address range, add to addrmap the address and the objfile
+	 * entry to which the range maps.  Only add entries relating to text
+	 * ranges.  (We assume that the text ranges are tightly packed, because
+	 * in any reasonable object file format they will be, so we can ignore
+	 * the size.)
+	 *
+	 * Ranges that do not correspond to a built-in module, but to an
+	 * always-built-in object file, have no obj2mod_elem and point at NULL
+	 * instead.
+	 */
+
+	while (fscanf(f, "%llx %llx %s\n", &addr, &size, obj) == 3) {
+		struct obj2mod_elem *elem = obj2mod_get(obj);
+
+		if (addr == 0 || size == 0 ||
+		    !addr_in_range(addr, text_ranges, ARRAY_SIZE(text_ranges)))
+			continue;
+
+		if (addrmap_num >= addrmap_alloced) {
+			addrmap_alloced *= 2;
+			addrmap = realloc(addrmap,
+			    sizeof(*addrmap) * addrmap_alloced);
+			if (!addrmap)
+				goto oom;
+		}
+
+                addrmap[addrmap_num].addr = addr;
+                addrmap[addrmap_num].objfile = elem;
+		addrmap_num++;
+	}
+	fclose(f);
+	return;
+
+oom:
+	fprintf(stderr, "kallsyms: out of memory\n");
+	exit(1);
+}
+
+/*
+ * Read "modules_thick.builtin" (the list of built-in modules).  Construct the
+ * obj2mod hash to track objfile -> module mappings.  Read ".tmp_vmlinux.ranges"
+ * (the linker map) and build addrmap[], which maps address ranges to built-in
+ * module names (using obj2mod).
+ */
+static void read_modules(const char *modules_builtin)
+{
+	struct modules_thick_iter *i;
+	char *module_name = NULL;
+	char **module_paths;
+
+	obj2mod_init();
+	/*
+	 * Iterate over all modules in modules_thick.builtin and add each.
+	 */
+	i = modules_thick_iter_new(modules_builtin);
+	if (i == NULL) {
+		fprintf(stderr, "Cannot iterate over builtin modules.\n");
+		exit(1);
+	}
+
+	while ((module_paths = modules_thick_iter_next(i, &module_name))) {
+		char **walk = module_paths;
+		while (*walk) {
+			obj2mod_add(*walk, module_name);
+			walk++;
+		}
+		free(module_paths);
+	}
+
+	free(module_name);
+	modules_thick_iter_free(i);
+
+	/*
+	 * Read linker map.
+	 */
+	read_linker_map();
+}
+#else
+static void read_modules(const char *unused) {}
+#endif /* CONFIG_KALLMODSYMS */
+
 int main(int argc, char **argv)
 {
+	const char *modules_builtin = "modules_thick.builtin";
+
 	if (argc >= 2) {
 		int i;
 		for (i = 1; i < argc; i++) {
-			if(strcmp(argv[i], "--all-symbols") == 0)
+			if (strcmp(argv[i], "--all-symbols") == 0)
 				all_symbols = 1;
 			else if (strcmp(argv[i], "--absolute-percpu") == 0)
 				absolute_percpu = 1;
 			else if (strcmp(argv[i], "--base-relative") == 0)
 				base_relative = 1;
+			else if (strncmp(argv[i], "--builtin=", 10) == 0)
+				modules_builtin = &argv[i][10];
 			else
 				usage();
 		}
@@ -780,6 +1125,7 @@ int main(int argc, char **argv)
 		usage();
 
 	read_map(stdin);
+	read_modules(modules_builtin);
 	shrink_table();
 	if (absolute_percpu)
 		make_percpus_absolute();
-- 
2.35.0.260.gb82b153193.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v8 5/6] kallsyms: optimize .kallsyms_modules*
  2022-02-08 18:43 [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Nick Alcock
                   ` (3 preceding siblings ...)
  2022-02-08 18:43 ` [PATCH v8 4/6] kallsyms: introduce sections needed to map symbols to built-in modules Nick Alcock
@ 2022-02-08 18:43 ` Nick Alcock
  2022-02-08 18:43 ` [PATCH v8 6/6] kallsyms: add /proc/kallmodsyms Nick Alcock
  2022-02-11 22:36 ` [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Jiri Olsa
  6 siblings, 0 replies; 12+ messages in thread
From: Nick Alcock @ 2022-02-08 18:43 UTC (permalink / raw)
  To: mcgrof, masahiroy
  Cc: jolsa, rostedt, bas, tglozar, Ast-x64, viktor.malik, dxu, acme,
	adrian.hunter, ak, irogers, linux-kbuild, linux-modules,
	linux-kernel, arnd, akpm, eugene.loh, kris.van.hees

These symbols are terribly inefficiently stored at the moment.  Add a
simple optimizer which fuses obj2mod_elem entries and uses this to
implement three cheap optimizations:

 - duplicate names are eliminated from .kallsyms_module_names.

 - entries in .kallsyms_modules which point at single-file modules which
   also appear in a multi-module list are redirected to point inside
   that list, and the single-file entry is dropped from
   .kallsyms_module_names.  Thus, modules which contain some object
   files shared with other modules and some object files exclusive to
   them do not double up the module name.  (There might still be some
   duplication between multiple multi-module lists, but this is an
   extremely marginal size effect, and resolving it would require an
   extra layer of lookup tables which would be even more complex, and
   incompressible to boot).

 - Entries in .kallsyms_modules that would contain the same value after
   the above optimizations are fused together, along with their
   corresponding .kallsyms_module_addresses/offsets entries.  Due to
   this fusion process, and because object files can be split apart into
   multiple parts by the linker for hot/cold partitioning and the like,
   entries in .kallsyms_module_addresses/offsets no longer correspond
   1:1 to object files, but more to some contiguous range of addresses
   which are guaranteed to belong to a single built-in module, but which
   may well stretch over multiple object files.

The optimizer's time complexity is O(log n) in the number of objfiles at
most (and probably much lower), so, given the relatively low number of
objfiles, its runtime overhead is in the noise.

Optimization reduces the overhead of the kallmodsyms tables by about
7500 items, dropping the .tmp_kallsyms2.o object file size by about
33KiB, leaving it 8672 bytes larger than before: a gain of .4%.

The vmlinux size is not yet affected because the variables are not used
and are eliminated by the linker: but if they were used (after the next
commit), the size impact of all of this on the final kernel is minimal:
in my testing, vmlinux grew by 0.17% (10824 bytes), and the compressed
vmlinux only grew by 0.08% (7552 bytes): though this is very
configuration-dependent, it seems likely to scale roughly with the
kernel as a whole.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
---
 scripts/kallsyms.c | 267 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 258 insertions(+), 9 deletions(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 8f87b724d0fa..93fdf0dcf587 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -85,6 +85,17 @@ static unsigned int strhash(const char *s)
 	return hash;
 }
 
+static unsigned int memhash(char *s, size_t len)
+{
+	/* fnv32 hash */
+	unsigned int hash = 2166136261U;
+	size_t i;
+
+	for (i = 0; i < len; i++)
+		hash = (hash ^ *(s + i)) * 0x01000193;
+	return hash;
+}
+
 #define OBJ2MOD_BITS 10
 #define OBJ2MOD_N (1 << OBJ2MOD_BITS)
 #define OBJ2MOD_MASK (OBJ2MOD_N - 1)
@@ -94,14 +105,24 @@ struct obj2mod_elem {
 	size_t nmods;			/* number of modules in "mods" */
 	size_t mods_size;		/* size of all mods together */
 	int mod_offset;			/* offset in .kallsyms_module_names */
+	/*
+	 * If set at emission time, this points at another obj2mod entry that
+	 * contains the module name we need (possibly at a slightly later
+	 * offset, if the entry is for an objfile that appears in many modules).
+	 */
+	struct obj2mod_elem *xref;
 	struct obj2mod_elem *obj2mod_next;
+	struct obj2mod_elem *mod2obj_next;
 };
 
 /*
- * Map from object files to obj2mod entries (a unique mapping).
+ * Map from object files to obj2mod entries (a unique mapping), and vice versa
+ * (not unique, but entries for objfiles in more than one module in this hash
+ * are ignored).
  */
 
 static struct obj2mod_elem *obj2mod[OBJ2MOD_N];
+static struct obj2mod_elem *mod2obj[OBJ2MOD_N];
 static size_t num_objfiles;
 
 /*
@@ -143,6 +164,8 @@ static void obj2mod_add(char *obj, char *mod)
 
 	elem = obj2mod_get(obj);
 	if (!elem) {
+		int j = strhash(mod) & OBJ2MOD_MASK;
+
 		elem = malloc(sizeof(struct obj2mod_elem));
 		if (!elem)
 			goto oom;
@@ -156,8 +179,15 @@ static void obj2mod_add(char *obj, char *mod)
 
 		elem->obj2mod_next = obj2mod[i];
 		obj2mod[i] = elem;
+		elem->mod2obj_next = mod2obj[j];
+		mod2obj[j] = elem;
 		num_objfiles++;
 	} else {
+		/*
+		 * TU appears in multiple modules.  mod2obj for this entry will
+		 * be ignored from now on, except insofar as it is needed to
+		 * maintain the hash chain.
+		 */
 		elem->mods = realloc(elem->mods, elem->mods_size +
 				     strlen(mod) + 1);
 		if (!elem->mods)
@@ -177,6 +207,164 @@ static void obj2mod_add(char *obj, char *mod)
 	fprintf(stderr, "kallsyms: out of memory\n");
 	exit(1);
 }
+
+/*
+ * Used inside optimize_obj2mod to identify duplicate module entries.
+ */
+struct obj2mod_modhash_elem {
+	struct obj2mod_elem *elem;
+	unsigned int modhash;		/* hash value of this entry */
+};
+
+static int qstrcmp(const void *a, const void *b)
+{
+	return strcmp((const char *) a, (const char *) b);
+}
+
+static int qmodhash(const void *a, const void *b)
+{
+	const struct obj2mod_modhash_elem *el_a = a;
+	const struct obj2mod_modhash_elem *el_b = b;
+	if (el_a->modhash < el_b->modhash)
+		return -1;
+	else if (el_a->modhash > el_b->modhash)
+		return 1;
+	return 0;
+}
+
+/*
+ * Associate all TUs in obj2mod which refer to the same module with a single
+ * obj2mod entry for emission, preferring to point into the module list in a
+ * multi-module objfile.
+ */
+static void optimize_obj2mod(void)
+{
+	size_t i;
+	size_t n = 0;
+	struct obj2mod_elem *elem;
+	struct obj2mod_elem *dedup;
+	/* An array of all obj2mod_elems, later sorted by hashval.  */
+	struct obj2mod_modhash_elem *uniq;
+	struct obj2mod_modhash_elem *last;
+
+	/*
+	 * Canonicalize all module lists by sorting them, then compute their
+	 * hash values.
+	 */
+	uniq = malloc(sizeof(struct obj2mod_modhash_elem) * num_objfiles);
+	if (uniq == NULL)
+		goto oom;
+
+	for (i = 0; i < OBJ2MOD_N; i++) {
+		for (elem = obj2mod[i]; elem; elem = elem->obj2mod_next) {
+			if (elem->nmods >= 2) {
+				char **sorter;
+				char *walk;
+				char *tmp_mods;
+				size_t j;
+
+				tmp_mods = malloc(elem->mods_size);
+				sorter = malloc(sizeof(char *) * elem->nmods);
+				if (sorter == NULL || tmp_mods == NULL)
+					goto oom;
+				memcpy(tmp_mods, elem->mods, elem->mods_size);
+
+				for (j = 0, walk = tmp_mods; j < elem->nmods;
+				     j++) {
+					sorter[j] = walk;
+					walk += strlen(walk) + 1;
+				}
+				qsort(sorter, elem->nmods, sizeof (char *),
+				      qstrcmp);
+				for (j = 0, walk = elem->mods; j < elem->nmods;
+				     j++) {
+					strcpy(walk, sorter[j]);
+					walk += strlen(walk) + 1;
+				}
+				free(tmp_mods);
+				free(sorter);
+			}
+
+			uniq[n].elem = elem;
+			uniq[n].modhash = memhash(elem->mods, elem->mods_size);
+			n++;
+		}
+	}
+
+	qsort (uniq, num_objfiles, sizeof (struct obj2mod_modhash_elem),
+	       qmodhash);
+
+	/*
+	 * Work over multimodule entries.  These must be emitted into
+	 * .kallsyms_module_names as a unit, but we can still optimize by
+	 * reusing some other identical entry.  Single-file modules are amenable
+	 * to the same optimization, but we avoid doing it for now so that we
+	 * can prefer to point them directly inside a multimodule entry.
+	 */
+	for (i = 0, last = NULL; i < num_objfiles; i++) {
+		const char *onemod;
+		size_t j;
+
+		if (uniq[i].elem->nmods < 2)
+			continue;
+
+		/* Duplicate multimodule.  Reuse the first we saw.  */
+		if (last != NULL && last->modhash == uniq[i].modhash) {
+			uniq[i].elem->xref = last->elem;
+			continue;
+		}
+
+		/*
+		 * Single-module entries relating to modules also emitted as
+		 * part of this multimodule entry can refer to it: later, we
+		 * will hunt down the right specific module name within this
+		 * multimodule entry and point directly to it.
+		 */
+		onemod = uniq[i].elem->mods;
+		for (j = uniq[i].elem->nmods; j > 0; j--) {
+			int h = strhash(onemod) & OBJ2MOD_MASK;
+
+			for (dedup = mod2obj[h]; dedup;
+			     dedup = dedup->mod2obj_next) {
+				if (dedup->nmods > 1)
+					continue;
+
+				if (strcmp(dedup->mods, onemod) != 0)
+					continue;
+				dedup->xref = uniq[i].elem;
+				assert (uniq[i].elem->xref == NULL);
+			}
+			onemod += strlen(onemod) + 1;
+		}
+
+		last = &uniq[i];
+	}
+
+	/*
+	 * Now traverse all single-module entries, xreffing every one that
+	 * relates to a given module to the first one we saw that refers to that
+	 * module.
+	 */
+	for (i = 0, last = NULL; i < num_objfiles; i++) {
+		if (uniq[i].elem->nmods > 1)
+			continue;
+
+		if (uniq[i].elem->xref != NULL)
+			continue;
+
+		/* Duplicate module name.  Reuse the first we saw.  */
+		if (last != NULL && last->modhash == uniq[i].modhash) {
+			uniq[i].elem->xref = last->elem;
+			assert (last->elem->xref == NULL);
+			continue;
+		}
+		last = &uniq[i];
+	}
+	return;
+oom:
+	fprintf(stderr, "kallsyms: out of memory optimizing module list\n");
+	exit(EXIT_FAILURE);
+}
 #endif /* CONFIG_KALLMODSYMS */
 
 static void usage(void)
@@ -479,7 +667,7 @@ static void output_kallmodsyms_modules(void)
 	size_t i;
 
 	/*
-	 * Traverse and emit, updating mod_offset accordingly.
+	 * Traverse and emit, chasing xref and updating mod_offset accordingly.
 	 * Emit a single \0 at the start, to encode non-modular objfiles.
 	 */
 	output_label("kallsyms_module_names");
@@ -489,9 +677,15 @@ static void output_kallmodsyms_modules(void)
 		     elem = elem->obj2mod_next) {
 			const char *onemod;
 			size_t i;
+			struct obj2mod_elem *out_elem = elem;
 
-			elem->mod_offset = offset;
-			onemod = elem->mods;
+			if (elem->xref)
+				out_elem = elem->xref;
+			if (out_elem->mod_offset != 0)
+				continue;	/* Already emitted.  */
+
+			out_elem->mod_offset = offset;
+			onemod = out_elem->mods;
 
 			/*
 			 * Technically this is a waste of space: we could just
@@ -500,13 +694,13 @@ static void output_kallmodsyms_modules(void)
 			 * entry, but doing it this way makes it more obvious
 			 * when an entry is a multimodule entry.
 			 */
-			if (elem->nmods != 1) {
+			if (out_elem->nmods != 1) {
 				printf("\t.byte\t0\n");
-				printf("\t.byte\t%zi\n", elem->nmods);
+				printf("\t.byte\t%zi\n", out_elem->nmods);
 				offset += 2;
 			}
 
-			for (i = elem->nmods; i > 0; i--) {
+			for (i = out_elem->nmods; i > 0; i--) {
 				printf("\t.asciz\t\"%s\"\n", onemod);
 				offset += strlen(onemod) + 1;
 				onemod += strlen(onemod) + 1;
@@ -533,6 +727,13 @@ static void output_kallmodsyms_objfiles(void)
 		long long offset;
 		int overflow;
 
+                /*
+                 * Fuse consecutive address ranges citing the same object file
+                 * into one.
+                 */
+                if (i > 0 && addrmap[i-1].objfile == addrmap[i].objfile)
+                        continue;
+
 		if (base_relative) {
 			if (!absolute_percpu) {
 				offset = addrmap[i].addr - relative_base;
@@ -558,6 +759,13 @@ static void output_kallmodsyms_objfiles(void)
 
 	for (i = 0; i < addrmap_num; i++) {
 		struct obj2mod_elem *elem = addrmap[i].objfile;
+		int orig_nmods;
+		const char *orig_modname;
+		int mod_offset;
+
+		if (i > 0 && addrmap[i-1].objfile == addrmap[i].objfile)
+			continue;
+
 		/*
 		 * Address range cites no object file: point at 0, the built-in
 		 * module.
@@ -568,13 +776,53 @@ static void output_kallmodsyms_objfiles(void)
 			continue;
 		}
 
+		orig_nmods = elem->nmods;
+		orig_modname = elem->mods;
+
+		/*
+		 * Chase down xrefs, if need be.  There can only be one layer of
+		 * these: from single-module entry to other single-module entry,
+		 * or from single- or multi-module entry to another multi-module
+		 * entry.  Single -> single and multi -> multi always points at
+		 * the start of the xref target, so its offset can be used as is.
+		 */
+		if (elem->xref)
+			elem = elem->xref;
+
+		if (elem->nmods == 1 || orig_nmods > 1)
+			mod_offset = elem->mod_offset;
+		else {
+			/*
+			 * If this is a reference from a single-module entry to
+			 * a multi-module entry, hunt down the offset to this
+			 * specific module's name (which is guaranteed to be
+			 * present: see optimize_obj2mod).
+			 */
+
+			size_t j = elem->nmods;
+			const char *onemod = elem->mods;
+			mod_offset = elem->mod_offset;
+
+			for (; j > 0; j--) {
+				if (strcmp(orig_modname, onemod) == 0)
+					break;
+				onemod += strlen(onemod) + 1;
+			}
+			assert (j > 0);
+			/*
+			 * +2 to skip the null byte and count at the start of
+			 * the multimodule entry.
+			 */
+			mod_offset += onemod - elem->mods + 2;
+		}
+
 		/*
 		 * Zero offset is the initial \0, there to catch uninitialized
 		 * obj2mod entries, and is forbidden.
 		 */
-		assert (elem->mod_offset != 0);
+		assert (mod_offset != 0);
 
-		printf("\t.long\t0x%x\n", elem->mod_offset);
+		printf("\t.long\t0x%x\n", mod_offset);
 		emitted_objfiles++;
 	}
 
@@ -1093,6 +1341,7 @@ static void read_modules(const char *modules_builtin)
 
 	free(module_name);
 	modules_thick_iter_free(i);
+	optimize_obj2mod();
 
 	/*
 	 * Read linker map.
-- 
2.35.0.260.gb82b153193.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v8 6/6] kallsyms: add /proc/kallmodsyms
  2022-02-08 18:43 [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Nick Alcock
                   ` (4 preceding siblings ...)
  2022-02-08 18:43 ` [PATCH v8 5/6] kallsyms: optimize .kallsyms_modules* Nick Alcock
@ 2022-02-08 18:43 ` Nick Alcock
  2022-02-11 22:36 ` [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Jiri Olsa
  6 siblings, 0 replies; 12+ messages in thread
From: Nick Alcock @ 2022-02-08 18:43 UTC (permalink / raw)
  To: mcgrof, masahiroy
  Cc: jolsa, rostedt, bas, tglozar, Ast-x64, viktor.malik, dxu, acme,
	adrian.hunter, ak, irogers, linux-kbuild, linux-modules,
	linux-kernel, arnd, akpm, eugene.loh, kris.van.hees

Use the tables added in the previous commits to introduce a new
/proc/kallmodsyms, in which [module names] are also given for things
that *could* have been modular had they not been built in to the kernel.
So symbols that are part of, say, ext4 are reported as [ext4] even if
ext4 happens to be buiilt in to the kernel in this configuration.

Symbols that are part of multiple modules at the same time are shown
with [multiple] [module names]: consumers will have to be ready to
handle such lines.  Also, kernel symbols for built-in modules will be
sorted by size, as usual for the core kernel, so will probably appear
interspersed with other symbols that are part of different modules and
non-modular always-built-in symbols, which, as usual, have no
square-bracketed module denotation.  This differs from /proc/kallsyms,
where all symbols associated with a module will always appear in a group
(and randomly ordered).

The result looks like this:

ffffffff8b013d20 t pt_buffer_setup_aux
ffffffff8b014130 T intel_pt_interrupt
ffffffff8b014250 T cpu_emergency_stop_pt
ffffffff8b014280 t rapl_pmu_event_init      [intel_rapl_perf]
ffffffff8b0143c0 t rapl_event_update [intel_rapl_perf]
ffffffff8b014480 t rapl_pmu_event_read       [intel_rapl_perf]
ffffffff8b014490 t rapl_cpu_offline  [intel_rapl_perf]
ffffffff8b014540 t __rapl_event_show [intel_rapl_perf]
ffffffff8b014570 t rapl_pmu_event_stop       [intel_rapl_perf]

This is emitted even if intel_rapl_perf is built into the kernel (but,
obviously, not if it's not in the .config at all, or is in a module that
is not loaded).

Further down, we see what happens when object files are reused by
multiple modules, all of which are built in to the kernel:

ffffffffa22b3aa0 t handle_timestamp  [liquidio]
ffffffffa22b3b50 t free_netbuf       [liquidio]
ffffffffa22b3ba0 t liquidio_ptp_settime      [liquidio]
ffffffffa22b3c30 t liquidio_ptp_adjfreq      [liquidio]
[...]
ffffffffa22b9490 t lio_vf_rep_create        [liquidio]
ffffffffa22b96a0 t lio_vf_rep_destroy       [liquidio]
ffffffffa22b9810 t lio_vf_rep_modinit        [liquidio]
ffffffffa22b9830 t lio_vf_rep_modexit        [liquidio]
ffffffffa22b9850 t lio_ethtool_get_channels   [liquidio] [liquidio_vf]
ffffffffa22b9930 t lio_ethtool_get_ringparam  [liquidio] [liquidio_vf]
ffffffffa22b99d0 t lio_get_msglevel   [liquidio] [liquidio_vf]
ffffffffa22b99f0 t lio_vf_set_msglevel        [liquidio] [liquidio_vf]
ffffffffa22b9a10 t lio_get_pauseparam         [liquidio] [liquidio_vf]
ffffffffa22b9a40 t lio_get_ethtool_stats     [liquidio] [liquidio_vf]
ffffffffa22ba180 t lio_vf_get_ethtool_stats  [liquidio] [liquidio_vf]
ffffffffa22ba4f0 t lio_get_regs_len   [liquidio] [liquidio_vf]
ffffffffa22ba530 t lio_get_priv_flags         [liquidio] [liquidio_vf]
ffffffffa22ba550 t lio_set_priv_flags         [liquidio] [liquidio_vf]
ffffffffa22ba580 t lio_set_fecparam   [liquidio] [liquidio_vf]
ffffffffa22ba5f0 t lio_get_fecparam   [liquidio] [liquidio_vf]
[...]
ffffffffa22cbd10 t liquidio_set_mac [liquidio_vf]
ffffffffa22cbe90 t handle_timestamp  [liquidio_vf]
ffffffffa22cbf40 t free_netbuf       [liquidio_vf]
ffffffffa22cbf90 t octnet_link_status_change [liquidio_vf]
ffffffffa22cbfc0 t liquidio_vxlan_port_command.constprop.0   [liquidio_vf]

Like /proc/kallsyms, the output is driven by address, so keeps the
curious property of /proc/kallsyms that symbols (like free_netbuf above)
may appear repeatedly with different addresses: but now, unlike in
/proc/kallsyms, we can see that those symbols appear repeatedly because
they are *different symbols* that ultimately belong to different
modules, all of which are built in to the kernel.

As with /proc/kallsyms, non-root usage produces addresses that are
all zero.

I am not wedded to the name or format of /proc/kallmodsyms, but felt it
best to split it out of /proc/kallsyms to avoid breaking existing
kallsyms parsers.  Another possible syntax might be to use {curly
brackets} or something to denote built-in modules: it might be possible
to drop /proc/kallmodsyms and make /proc/kallsyms emit things in this
format.  (Equally, now kallmodsyms data uses very little space, the
CONFIG_KALLMODSYMS config option might be something people don't want to
bother with.)

Internally, this uses a new kallsyms_builtin_module_address() almost
identical to kallsyms_sym_address() to get the address corresponding to
a given .kallsyms_modules index, and a new get_builtin_module_idx quite
similar to get_symbol_pos to determine the index in the
.kallsyms_modules array that relates to a given address.  Save a little
time by exploiting the fact that all callers will only ever traverse
this list from start to end by allowing them to pass in the previous
index returned from this function as a hint: thus very few bsearches are
actually needed.  (In theory this could change to just walk straight
down kallsyms_module_addresses/offsets and not bother bsearching at all,
but doing it this way is hardly any slower and much more robust.)

We explicitly filter out displaying modules for non-text symbols and
section start/end symbols, because the former appear in distinct ranges
which are not covered by the linker-generated ranges and the latter
appear as text symbols and can share an address with built-in module
symbols, but are not part of those modules.

The display process is complicated a little by the weird format of the
.kallsyms_module_names table: we have to look for multimodule entries
and print them as space-separated lists of module names.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
---
 kernel/kallsyms.c | 245 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 230 insertions(+), 15 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 951c93216fc4..31469565967e 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -48,8 +48,18 @@ __section(".rodata") __attribute__((weak));
 extern const unsigned long kallsyms_relative_base
 __section(".rodata") __attribute__((weak));
 
+extern const unsigned long kallsyms_num_modules
+__section(".rodata") __attribute__((weak));
+
+extern const unsigned long kallsyms_module_names_len
+__section(".rodata") __attribute__((weak));
+
 extern const char kallsyms_token_table[] __weak;
 extern const u16 kallsyms_token_index[] __weak;
+extern const unsigned long kallsyms_module_addresses[] __weak;
+extern const int kallsyms_module_offsets[] __weak;
+extern const u32 kallsyms_modules[] __weak;
+extern const char kallsyms_module_names[] __weak;
 
 extern const unsigned int kallsyms_markers[] __weak;
 
@@ -205,6 +215,25 @@ static bool cleanup_symbol_name(char *s)
 	return false;
 }
 
+#ifdef CONFIG_KALLMODSYMS
+static unsigned long kallsyms_builtin_module_address(int idx)
+{
+	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+		return kallsyms_module_addresses[idx];
+
+	/* values are unsigned offsets if --absolute-percpu is not in effect */
+	if (!IS_ENABLED(CONFIG_KALLSYMS_ABSOLUTE_PERCPU))
+		return kallsyms_relative_base + (u32)kallsyms_module_offsets[idx];
+
+	/* ...otherwise, positive offsets are absolute values */
+	if (kallsyms_module_offsets[idx] >= 0)
+		return kallsyms_module_offsets[idx];
+
+	/* ...and negative offsets are relative to kallsyms_relative_base - 1 */
+	return kallsyms_relative_base - 1 - kallsyms_module_offsets[idx];
+}
+#endif
+
 /* Lookup the address for this symbol. Returns 0 if not found. */
 unsigned long kallsyms_lookup_name(const char *name)
 {
@@ -309,6 +338,54 @@ static unsigned long get_symbol_pos(unsigned long addr,
 	return low;
 }
 
+/*
+ * The caller passes in an address, and we return an index to the corresponding
+ * builtin module index in .kallsyms_modules, or (unsigned long) -1 if none
+ * match.
+ *
+ * The hint_idx, if set, is a hint as to the possible return value, to handle
+ * the common case in which consecutive runs of addresses relate to the same
+ * index.
+ */
+#ifdef CONFIG_KALLMODSYMS
+static unsigned long get_builtin_module_idx(unsigned long addr, unsigned long hint_idx)
+{
+	unsigned long low, high, mid;
+
+	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+		BUG_ON(!kallsyms_module_addresses);
+	else
+		BUG_ON(!kallsyms_module_offsets);
+
+	/*
+	 * Do a binary search on the sorted kallsyms_modules array.  The last
+	 * entry in this array indicates the end of the text section, not an
+	 * object file.
+	 */
+	low = 0;
+	high = kallsyms_num_modules - 1;
+
+	if (hint_idx > low && hint_idx < (high - 1) &&
+	    addr >= kallsyms_builtin_module_address(hint_idx) &&
+	    addr < kallsyms_builtin_module_address(hint_idx + 1))
+		return hint_idx;
+
+	if (addr >= kallsyms_builtin_module_address(low)
+	    && addr < kallsyms_builtin_module_address(high)) {
+		while (high - low > 1) {
+			mid = low + (high - low) / 2;
+			if (kallsyms_builtin_module_address(mid) <= addr)
+				low = mid;
+			else
+				high = mid;
+		}
+		return low;
+	}
+
+	return (unsigned long) -1;
+}
+#endif
+
 /*
  * Lookup an address but don't bother to find any names.
  */
@@ -580,6 +657,8 @@ struct kallsym_iter {
 	char type;
 	char name[KSYM_NAME_LEN];
 	char module_name[MODULE_NAME_LEN];
+	const char *builtin_module_names;
+	unsigned long hint_builtin_module_idx;
 	int exported;
 	int show_value;
 };
@@ -610,6 +689,8 @@ static int get_ksymbol_mod(struct kallsym_iter *iter)
 				     &iter->value, &iter->type,
 				     iter->name, iter->module_name,
 				     &iter->exported);
+	iter->builtin_module_names = NULL;
+
 	if (ret < 0) {
 		iter->pos_mod_end = iter->pos;
 		return 0;
@@ -629,6 +710,8 @@ static int get_ksymbol_ftrace_mod(struct kallsym_iter *iter)
 					 &iter->value, &iter->type,
 					 iter->name, iter->module_name,
 					 &iter->exported);
+	iter->builtin_module_names = NULL;
+
 	if (ret < 0) {
 		iter->pos_ftrace_mod_end = iter->pos;
 		return 0;
@@ -643,6 +726,7 @@ static int get_ksymbol_bpf(struct kallsym_iter *iter)
 
 	strlcpy(iter->module_name, "bpf", MODULE_NAME_LEN);
 	iter->exported = 0;
+	iter->builtin_module_names = NULL;
 	ret = bpf_get_kallsym(iter->pos - iter->pos_ftrace_mod_end,
 			      &iter->value, &iter->type,
 			      iter->name);
@@ -663,23 +747,56 @@ static int get_ksymbol_kprobe(struct kallsym_iter *iter)
 {
 	strlcpy(iter->module_name, "__builtin__kprobes", MODULE_NAME_LEN);
 	iter->exported = 0;
+	iter->builtin_module_names = NULL;
 	return kprobe_get_kallsym(iter->pos - iter->pos_bpf_end,
 				  &iter->value, &iter->type,
 				  iter->name) < 0 ? 0 : 1;
 }
 
 /* Returns space to next name. */
-static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
+static unsigned long get_ksymbol_core(struct kallsym_iter *iter, int kallmodsyms)
 {
 	unsigned off = iter->nameoff;
 
-	iter->module_name[0] = '\0';
+	iter->exported = 0;
 	iter->value = kallsyms_sym_address(iter->pos);
 
 	iter->type = kallsyms_get_symbol_type(off);
 
+	iter->module_name[0] = '\0';
+	iter->builtin_module_names = NULL;
+
 	off = kallsyms_expand_symbol(off, iter->name, ARRAY_SIZE(iter->name));
+#ifdef CONFIG_KALLMODSYMS
+	if (kallmodsyms) {
+		unsigned long mod_idx = (unsigned long) -1;
+
+		if (kallsyms_module_offsets)
+			mod_idx =
+			  get_builtin_module_idx(iter->value,
+						 iter->hint_builtin_module_idx);
 
+		/*
+		 * This is a built-in module iff the tables of built-in modules
+		 * (address->module name mappings) and module names are known,
+		 * and if the address was found there, and if the corresponding
+		 * module index is nonzero, and iff this is a text (or weak)
+		 * symbol.  All other cases mean off the end of the binary or in
+		 * a non-modular range in between one or more modules.  (Also
+		 * guard against a corrupt kallsyms_objfiles array pointing off
+		 * the end of kallsyms_modules.)
+		 */
+		if (kallsyms_modules != NULL && kallsyms_module_names != NULL &&
+		    (iter->type == 't' || iter->type == 'T' ||
+		     iter->type == 'w' || iter->type == 'W') &&
+		    mod_idx != (unsigned long) -1 &&
+		    kallsyms_modules[mod_idx] != 0 &&
+		    kallsyms_modules[mod_idx] < kallsyms_module_names_len)
+			iter->builtin_module_names =
+				&kallsyms_module_names[kallsyms_modules[mod_idx]];
+		iter->hint_builtin_module_idx = mod_idx;
+	}
+#endif
 	return off - iter->nameoff;
 }
 
@@ -725,7 +842,7 @@ static int update_iter_mod(struct kallsym_iter *iter, loff_t pos)
 }
 
 /* Returns false if pos at or past end of file. */
-static int update_iter(struct kallsym_iter *iter, loff_t pos)
+static int update_iter(struct kallsym_iter *iter, loff_t pos, int kallmodsyms)
 {
 	/* Module symbols can be accessed randomly. */
 	if (pos >= kallsyms_num_syms)
@@ -735,7 +852,7 @@ static int update_iter(struct kallsym_iter *iter, loff_t pos)
 	if (pos != iter->pos)
 		reset_iter(iter, pos);
 
-	iter->nameoff += get_ksymbol_core(iter);
+	iter->nameoff += get_ksymbol_core(iter, kallmodsyms);
 	iter->pos++;
 
 	return 1;
@@ -745,14 +862,14 @@ static void *s_next(struct seq_file *m, void *p, loff_t *pos)
 {
 	(*pos)++;
 
-	if (!update_iter(m->private, *pos))
+	if (!update_iter(m->private, *pos, 0))
 		return NULL;
 	return p;
 }
 
 static void *s_start(struct seq_file *m, loff_t *pos)
 {
-	if (!update_iter(m->private, *pos))
+	if (!update_iter(m->private, *pos, 0))
 		return NULL;
 	return m->private;
 }
@@ -761,7 +878,7 @@ static void s_stop(struct seq_file *m, void *p)
 {
 }
 
-static int s_show(struct seq_file *m, void *p)
+static int s_show_internal(struct seq_file *m, void *p, int kallmodsyms)
 {
 	void *value;
 	struct kallsym_iter *iter = m->private;
@@ -772,23 +889,67 @@ static int s_show(struct seq_file *m, void *p)
 
 	value = iter->show_value ? (void *)iter->value : NULL;
 
-	if (iter->module_name[0]) {
+	/*
+	 * Real module, or built-in module and /proc/kallsyms being shown.
+	 */
+	if (iter->module_name[0] != '\0' ||
+	    (iter->builtin_module_names != NULL && kallmodsyms != 0)) {
 		char type;
 
 		/*
-		 * Label it "global" if it is exported,
-		 * "local" if not exported.
+		 * Label it "global" if it is exported, "local" if not exported.
 		 */
 		type = iter->exported ? toupper(iter->type) :
 					tolower(iter->type);
-		seq_printf(m, "%px %c %s\t[%s]\n", value,
-			   type, iter->name, iter->module_name);
+#ifdef CONFIG_KALLMODSYMS
+		if (kallmodsyms) {
+			/*
+			 * /proc/kallmodsyms, built as a module.
+			 */
+			if (iter->builtin_module_names == NULL)
+				seq_printf(m, "%px %c %s\t[%s]\n", value,
+					   type, iter->name,
+					   iter->module_name);
+			/*
+			 * /proc/kallmodsyms, single-module symbol.
+			 */
+			else if (*iter->builtin_module_names != '\0')
+				seq_printf(m, "%px %c %s\t[%s]\n", value,
+					   type, iter->name,
+					   iter->builtin_module_names);
+			/*
+			 * /proc/kallmodsyms, multimodule symbol.  Formatted
+			 * as \0MODULE_COUNTmodule-1\0module-2\0, where
+			 * MODULE_COUNT is a single byte, 2 or higher.
+			 */
+			else {
+				size_t i = *(char *)(iter->builtin_module_names + 1);
+				const char *walk = iter->builtin_module_names + 2;
+
+				seq_printf(m, "%px %c %s\t[%s]", value,
+					   type, iter->name, walk);
+
+                                while (--i > 0) {
+					walk += strlen(walk) + 1;
+					seq_printf (m, " [%s]", walk);
+				}
+				seq_printf(m, "\n");
+			}
+		} else				/* !kallmodsyms */
+#endif /* CONFIG_KALLMODSYMS */
+			seq_printf(m, "%px %c %s\t[%s]\n", value,
+				   type, iter->name, iter->module_name);
 	} else
 		seq_printf(m, "%px %c %s\n", value,
 			   iter->type, iter->name);
 	return 0;
 }
 
+static int s_show(struct seq_file *m, void *p)
+{
+	return s_show_internal(m, p, 0);
+}
+
 static const struct seq_operations kallsyms_op = {
 	.start = s_start,
 	.next = s_next,
@@ -796,6 +957,35 @@ static const struct seq_operations kallsyms_op = {
 	.show = s_show
 };
 
+#ifdef CONFIG_KALLMODSYMS
+static int s_mod_show(struct seq_file *m, void *p)
+{
+	return s_show_internal(m, p, 1);
+}
+static void *s_mod_next(struct seq_file *m, void *p, loff_t *pos)
+{
+	(*pos)++;
+
+	if (!update_iter(m->private, *pos, 1))
+		return NULL;
+	return p;
+}
+
+static void *s_mod_start(struct seq_file *m, loff_t *pos)
+{
+	if (!update_iter(m->private, *pos, 1))
+		return NULL;
+	return m->private;
+}
+
+static const struct seq_operations kallmodsyms_op = {
+	.start = s_mod_start,
+	.next = s_mod_next,
+	.stop = s_stop,
+	.show = s_mod_show
+};
+#endif
+
 static inline int kallsyms_for_perf(void)
 {
 #ifdef CONFIG_PERF_EVENTS
@@ -831,7 +1021,8 @@ bool kallsyms_show_value(const struct cred *cred)
 	}
 }
 
-static int kallsyms_open(struct inode *inode, struct file *file)
+static int kallsyms_open_internal(struct inode *inode, struct file *file,
+	const struct seq_operations *ops)
 {
 	/*
 	 * We keep iterator in m->private, since normal case is to
@@ -839,7 +1030,7 @@ static int kallsyms_open(struct inode *inode, struct file *file)
 	 * using get_symbol_offset for every symbol.
 	 */
 	struct kallsym_iter *iter;
-	iter = __seq_open_private(file, &kallsyms_op, sizeof(*iter));
+	iter = __seq_open_private(file, ops, sizeof(*iter));
 	if (!iter)
 		return -ENOMEM;
 	reset_iter(iter, 0);
@@ -852,6 +1043,18 @@ static int kallsyms_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
+static int kallsyms_open(struct inode *inode, struct file *file)
+{
+	return kallsyms_open_internal(inode, file, &kallsyms_op);
+}
+
+#ifdef CONFIG_KALLMODSYMS
+static int kallmodsyms_open(struct inode *inode, struct file *file)
+{
+	return kallsyms_open_internal(inode, file, &kallmodsyms_op);
+}
+#endif
+
 #ifdef	CONFIG_KGDB_KDB
 const char *kdb_walk_kallsyms(loff_t *pos)
 {
@@ -862,7 +1065,7 @@ const char *kdb_walk_kallsyms(loff_t *pos)
 		reset_iter(&kdb_walk_kallsyms_iter, 0);
 	}
 	while (1) {
-		if (!update_iter(&kdb_walk_kallsyms_iter, *pos))
+		if (!update_iter(&kdb_walk_kallsyms_iter, *pos, 0))
 			return NULL;
 		++*pos;
 		/* Some debugging symbols have no name.  Ignore them. */
@@ -879,9 +1082,21 @@ static const struct proc_ops kallsyms_proc_ops = {
 	.proc_release	= seq_release_private,
 };
 
+#ifdef CONFIG_KALLMODSYMS
+static const struct proc_ops kallmodsyms_proc_ops = {
+	.proc_open	= kallmodsyms_open,
+	.proc_read	= seq_read,
+	.proc_lseek	= seq_lseek,
+	.proc_release	= seq_release_private,
+};
+#endif
+
 static int __init kallsyms_init(void)
 {
 	proc_create("kallsyms", 0444, NULL, &kallsyms_proc_ops);
+#ifdef CONFIG_KALLMODSYMS
+	proc_create("kallmodsyms", 0444, NULL, &kallmodsyms_proc_ops);
+#endif
 	return 0;
 }
 device_initcall(kallsyms_init);
-- 
2.35.0.260.gb82b153193.dirty


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v8 2/6] kbuild: add modules_thick.builtin
  2022-02-08 18:43 ` [PATCH v8 2/6] kbuild: add modules_thick.builtin Nick Alcock
@ 2022-02-10  0:35   ` Masahiro Yamada
  2022-02-10 12:55     ` Nick Alcock
  0 siblings, 1 reply; 12+ messages in thread
From: Masahiro Yamada @ 2022-02-10  0:35 UTC (permalink / raw)
  To: Nick Alcock
  Cc: Luis R. Rodriguez, Jiri Olsa, Steven Rostedt, bas, tglozar,
	Ast-x64, viktor.malik, Daniel Xu, Arnaldo Carvalho de Melo,
	Adrian Hunter, Andi Kleen, Ian Rogers, Linux Kbuild mailing list,
	linux-modules, Linux Kernel Mailing List, Arnd Bergmann,
	Andrew Morton, Eugene Loh, Kris Van Hees

On Wed, Feb 9, 2022 at 3:44 AM Nick Alcock <nick.alcock@oracle.com> wrote:
>
> This is similar to modules.builtin, and constructed in a similar way to
> the way that used to be built before commit
> 8b41fc4454e36fbfdbb23f940d023d4dece2de29, via tristate.conf inclusion
> and recursive concatenation up the tree. Unlike modules.builtin,
> modules_thick.builtin givs the names of the object files that make up
> modules that are comprised of more than one object file, using a syntax
> similar to that of makefiles, e.g.:
>
> crypto/crypto.o: crypto/api.o crypto/cipher.o crypto/compress.o crypto/memneq.o
> crypto/crypto_algapi.o: crypto/algapi.o crypto/proc.o crypto/scatterwalk.o
> crypto/aead.o:
> crypto/geniv.o:
>
> (where the latter two are single-file modules).
>
> An upcoming commit will use this mapping to populate /proc/kallmodsyms.
>
> A parser is included that yields a stram of (module, objfile name[])
> mappings: it's a bit baroque, but then parsing text files in C is quite
> painful, and I'd rather put the complexity in here than in its callers.
> The parser is not built in this commit, nor does it have any callers
> yet; nor is any rule added that causes modules_thick.builtin to actually
> be constructed.  (Again, see a later commit for that.)
>
> I am not wedded to the approach used to construct this file, but I don't
> see any other way to do it despite spending a week or so trying to tie
> it into Kbuild without using a separate Makefile.modbuiltin: unlike the
> names of builtin modules (which are also recorded in the source files
> themseves via MODULE_*() macros) the mapping from object file name to
> built-in module name is not recorded anywhere but in the makefiles
> themselves, so we have to at least reparse them with something to
> indicate the builtin-ness of each module (i.e., tristate.conf) if we are
> to figure out which modules are built-in and which are not.
>
> Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>


modules.builtin was initially implemented in a terrible way,
hence I cleaned up the code and removed the double recursion
of the source tree.

Honestly, I do not want to see you bringing back
all the bloat.





> ---
>  .gitignore                  |   1 +
>  Documentation/dontdiff      |   1 +
>  Makefile                    |  19 +++-
>  scripts/Kbuild.include      |   6 ++
>  scripts/Makefile.modbuiltin |  56 ++++++++++
>  scripts/modules_thick.c     | 200 ++++++++++++++++++++++++++++++++++++
>  scripts/modules_thick.h     |  48 +++++++++
>  7 files changed, 330 insertions(+), 1 deletion(-)
>  create mode 100644 scripts/Makefile.modbuiltin
>  create mode 100644 scripts/modules_thick.c
>  create mode 100644 scripts/modules_thick.h
>
> diff --git a/.gitignore b/.gitignore
> index 7afd412dadd2..b49cd96f587a 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -49,6 +49,7 @@
>  *.zst
>  Module.symvers
>  modules.order
> +modules_thick.builtin
>
>  #
>  # Top-level generic files
> diff --git a/Documentation/dontdiff b/Documentation/dontdiff
> index 910b30a2a7d9..6a78988547d0 100644
> --- a/Documentation/dontdiff
> +++ b/Documentation/dontdiff
> @@ -183,6 +183,7 @@ modules.builtin
>  modules.builtin.modinfo
>  modules.nsdeps
>  modules.order
> +modules_thick.builtin
>  modversions.h*
>  nconf
>  nconf-cfg
> diff --git a/Makefile b/Makefile
> index 199b6f388484..5e823fe8390f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1470,6 +1470,23 @@ __modinst_pre:
>
>  endif # CONFIG_MODULES
>
> +# modules_thick.builtin maps from kernel modules (or rather the object file
> +# names they would have had had they not been built in) to their constituent
> +# object files: we can use this to determine which modules any given object
> +# file is part of.  (We cannot eliminate the slight redundancy here without
> +# double-expansion.)
> +
> +modthickbuiltin-files := $(addsuffix /modules_thick.builtin, $(build-dirs))
> +
> +modules_thick.builtin: $(modthickbuiltin-files)
> +       $(Q)$(AWK) '!x[$$0]++' $(addsuffix /$@, $(build-dirs)) > $@
> +
> +# tristate.conf is not included from this Makefile. Add it as a prerequisite
> +# here to make it self-healing in case somebody accidentally removes it.
> +$(modthickbuiltin-files): include/config/tristate.conf
> +       $(Q)$(MAKE) $(modbuiltin)=$(patsubst %/modules_thick.builtin,%,$@) builtin-file=modules_thick.builtin
> +
> +
>  ###
>  # Cleaning is done on three levels.
>  # make clean     Delete most generated files
> @@ -1849,7 +1866,7 @@ clean: $(clean-dirs)
>                 -o -name '*.lex.c' -o -name '*.tab.[ch]' \
>                 -o -name '*.asn1.[ch]' \
>                 -o -name '*.symtypes' -o -name 'modules.order' \
> -               -o -name '.tmp_*.o.*' \
> +               -o -name '.tmp_*.o.*' -o -name modules_thick.builtin \
>                 -o -name '*.c.[012]*.*' \
>                 -o -name '*.ll' \
>                 -o -name '*.gcno' \
> diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
> index 3514c2149e9d..167bbbd5fdf5 100644
> --- a/scripts/Kbuild.include
> +++ b/scripts/Kbuild.include
> @@ -74,6 +74,12 @@ endef
>  # $(Q)$(MAKE) $(build)=dir
>  build := -f $(srctree)/scripts/Makefile.build obj
>
> +###
> +# Shorthand for $(Q)$(MAKE) -f scripts/Makefile.modbuiltin obj=
> +# Usage:
> +# $(Q)$(MAKE) $(modbuiltin)=dir
> +modbuiltin := -f $(srctree)/scripts/Makefile.modbuiltin obj
> +
>  ###
>  # Shorthand for $(Q)$(MAKE) -f scripts/Makefile.dtbinst obj=
>  # Usage:
> diff --git a/scripts/Makefile.modbuiltin b/scripts/Makefile.modbuiltin
> new file mode 100644
> index 000000000000..a27b692ea795
> --- /dev/null
> +++ b/scripts/Makefile.modbuiltin
> @@ -0,0 +1,56 @@
> +# SPDX-License-Identifier: GPL-2.0
> +# ==========================================================================
> +# Generating modules_thick.builtin
> +# ==========================================================================
> +
> +src := $(obj)
> +
> +PHONY := __modbuiltin
> +__modbuiltin:
> +
> +include include/config/auto.conf
> +# tristate.conf sets tristate variables to uppercase 'Y' or 'M'
> +# That way, we get the list of built-in modules in obj-Y
> +include include/config/tristate.conf
> +
> +include scripts/Kbuild.include
> +
> +ifdef building_out_of_srctree
> +# Create output directory if not already present
> +_dummy := $(shell [ -d $(obj) ] || mkdir -p $(obj))
> +endif
> +
> +# The filename Kbuild has precedence over Makefile
> +kbuild-dir := $(if $(filter /%,$(src)),$(src),$(srctree)/$(src))
> +kbuild-file := $(if $(wildcard $(kbuild-dir)/Kbuild),$(kbuild-dir)/Kbuild,$(kbuild-dir)/Makefile)
> +include $(kbuild-file)
> +
> +include scripts/Makefile.lib
> +
> +modthickbuiltin-subdirs := $(patsubst %,%/modules_thick.builtin, $(subdir-ym))
> +modthickbuiltin-target  := $(obj)/modules_thick.builtin
> +
> +__modbuiltin: $(obj)/$(builtin-file) $(subdir-ym)
> +       @:
> +
> +$(modthickbuiltin-target): $(subdir-ym) FORCE
> +       $(Q) rm -f $@
> +       $(Q) $(foreach mod-o, $(filter %.o,$(obj-Y)),\
> +               printf "%s:" $(addprefix $(obj)/,$(mod-o)) >> $@; \
> +               printf " %s" $(sort $(strip $(addprefix $(obj)/,$($(mod-o:.o=-objs)) \
> +                       $($(mod-o:.o=-y)) $($(mod-o:.o=-Y))))) >> $@; \
> +               printf "\n" >> $@; ) \
> +       cat /dev/null $(modthickbuiltin-subdirs) >> $@;
> +
> +PHONY += FORCE
> +
> +FORCE:
> +
> +# Descending
> +# ---------------------------------------------------------------------------
> +
> +PHONY += $(subdir-ym)
> +$(subdir-ym):
> +       $(Q)$(MAKE) $(modbuiltin)=$@ builtin-file=$(builtin-file)
> +
> +.PHONY: $(PHONY)
> diff --git a/scripts/modules_thick.c b/scripts/modules_thick.c
> new file mode 100644
> index 000000000000..9a15e99c1330
> --- /dev/null
> +++ b/scripts/modules_thick.c
> @@ -0,0 +1,200 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * A simple modules_thick reader.
> + *
> + * (C) 2014, 2021 Oracle, Inc.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <errno.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include "modules_thick.h"
> +
> +/*
> + * Read a modules_thick.builtin file and translate it into a stream of
> + * name / module-name pairs.
> + */
> +
> +/*
> + * Construct a modules_thick.builtin iterator.
> + */
> +struct modules_thick_iter *
> +modules_thick_iter_new(const char *modules_thick_file)
> +{
> +       struct modules_thick_iter *i;
> +
> +       i = calloc(1, sizeof(struct modules_thick_iter));
> +       if (i == NULL)
> +               return NULL;
> +
> +       i->f = fopen(modules_thick_file, "r");
> +
> +       if (i->f == NULL) {
> +               fprintf(stderr, "Cannot open builtin module file %s: %s\n",
> +                       modules_thick_file, strerror(errno));
> +               return NULL;
> +       }
> +
> +       return i;
> +}
> +
> +/*
> + * Iterate, returning a new null-terminated array of object file names, and a
> + * new dynamically-allocated module name.  (The module name passed in is freed.)
> + *
> + * The array of object file names should be freed by the caller: the strings it
> + * points to are owned by the iterator, and should not be freed.
> + */
> +
> +char ** __attribute__((__nonnull__))
> +modules_thick_iter_next(struct modules_thick_iter *i, char **module_name)
> +{
> +       size_t npaths = 1;
> +       char **module_paths;
> +       char *last_slash;
> +       char *last_dot;
> +       char *trailing_linefeed;
> +       char *object_name = i->line;
> +       char *dash;
> +       int composite = 0;
> +
> +       /*
> +        * Read in all module entries, computing the suffixless, pathless name
> +        * of the module and building the next arrayful of object file names for
> +        * return.
> +        *
> +        * Modules can consist of multiple files: in this case, the portion
> +        * before the colon is the path to the module (as before): the portion
> +        * after the colon is a space-separated list of files that should be
> +        * considered part of this module.  In this case, the portion before the
> +        * name is an "object file" that does not actually exist: it is merged
> +        * into built-in.a without ever being written out.
> +        *
> +        * All module names have - translated to _, to match what is done to the
> +        * names of the same things when built as modules.
> +        */
> +
> +       /*
> +        * Reinvocation of exhausted iterator. Return NULL, once.
> +        */
> +retry:
> +       if (getline(&i->line, &i->line_size, i->f) < 0) {
> +               if (ferror(i->f)) {
> +                       fprintf(stderr, "Error reading from modules_thick file:"
> +                               " %s\n", strerror(errno));
> +                       exit(1);
> +               }
> +               rewind(i->f);
> +               return NULL;
> +       }
> +
> +       if (i->line[0] == '\0')
> +               goto retry;
> +
> +       /*
> +        * Slice the line in two at the colon, if any.  If there is anything
> +        * past the ': ', this is a composite module.  (We allow for no colon
> +        * for robustness, even though one should always be present.)
> +        */
> +       if (strchr(i->line, ':') != NULL) {
> +               char *name_start;
> +
> +               object_name = strchr(i->line, ':');
> +               *object_name = '\0';
> +               object_name++;
> +               name_start = object_name + strspn(object_name, " \n");
> +               if (*name_start != '\0') {
> +                       composite = 1;
> +                       object_name = name_start;
> +               }
> +       }
> +
> +       /*
> +        * Figure out the module name.
> +        */
> +       last_slash = strrchr(i->line, '/');
> +       last_slash = (!last_slash) ? i->line :
> +               last_slash + 1;
> +       free(*module_name);
> +       *module_name = strdup(last_slash);
> +       dash = *module_name;
> +
> +       while (dash != NULL) {
> +               dash = strchr(dash, '-');
> +               if (dash != NULL)
> +                       *dash = '_';
> +       }
> +
> +       last_dot = strrchr(*module_name, '.');
> +       if (last_dot != NULL)
> +               *last_dot = '\0';
> +
> +       trailing_linefeed = strchr(object_name, '\n');
> +       if (trailing_linefeed != NULL)
> +               *trailing_linefeed = '\0';
> +
> +       /*
> +        * Multifile separator? Object file names explicitly stated:
> +        * slice them up and shuffle them in.
> +        *
> +        * The array size may be an overestimate if any object file
> +        * names start or end with spaces (very unlikely) but cannot be
> +        * an underestimate.  (Check for it anyway.)
> +        */
> +       if (composite) {
> +               char *one_object;
> +
> +               for (npaths = 0, one_object = object_name;
> +                    one_object != NULL;
> +                    npaths++, one_object = strchr(one_object + 1, ' '));
> +       }
> +
> +       module_paths = malloc((npaths + 1) * sizeof(char *));
> +       if (!module_paths) {
> +               fprintf(stderr, "%s: out of memory on module %s\n", __func__,
> +                       *module_name);
> +               exit(1);
> +       }
> +
> +       if (composite) {
> +               char *one_object;
> +               size_t i = 0;
> +
> +               while ((one_object = strsep(&object_name, " ")) != NULL) {
> +                       if (i >= npaths) {
> +                               fprintf(stderr, "%s: num_objs overflow on module "
> +                                       "%s: this is a bug.\n", __func__,
> +                                       *module_name);
> +                               exit(1);
> +                       }
> +
> +                       module_paths[i++] = one_object;
> +               }
> +       } else
> +               module_paths[0] = i->line;      /* untransformed module name */
> +
> +       module_paths[npaths] = NULL;
> +
> +       return module_paths;
> +}
> +
> +/*
> + * Free an iterator. Can be called while iteration is underway, so even
> + * state that is freed at the end of iteration must be freed here too.
> + */
> +void
> +modules_thick_iter_free(struct modules_thick_iter *i)
> +{
> +       if (i == NULL)
> +               return;
> +       fclose(i->f);
> +       free(i->line);
> +       free(i);
> +}
> diff --git a/scripts/modules_thick.h b/scripts/modules_thick.h
> new file mode 100644
> index 000000000000..f5edcaf9550c
> --- /dev/null
> +++ b/scripts/modules_thick.h
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * A simple modules_thick reader.
> + *
> + * (C) 2014, 2021 Oracle, Inc.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#ifndef _LINUX_MODULES_THICK_H
> +#define _LINUX_MODULES_THICK_H
> +
> +#include <stdio.h>
> +#include <stddef.h>
> +
> +/*
> + * modules_thick.builtin iteration state.
> + */
> +struct modules_thick_iter {
> +       FILE *f;
> +       char *line;
> +       size_t line_size;
> +};
> +
> +/*
> + * Construct a modules_thick.builtin iterator.
> + */
> +struct modules_thick_iter *
> +modules_thick_iter_new(const char *modules_thick_file);
> +
> +/*
> + * Iterate, returning a new null-terminated array of object file names, and a
> + * new dynamically-allocated module name.  (The module name passed in is freed.)
> + *
> + * The array of object file names should be freed by the caller: the strings it
> + * points to are owned by the iterator, and should not be freed.
> + */
> +
> +char ** __attribute__((__nonnull__))
> +modules_thick_iter_next(struct modules_thick_iter *i, char **module_name);
> +
> +void
> +modules_thick_iter_free(struct modules_thick_iter *i);
> +
> +#endif
> --
> 2.35.0.260.gb82b153193.dirty
>


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v8 4/6] kallsyms: introduce sections needed to map symbols to built-in modules
  2022-02-08 18:43 ` [PATCH v8 4/6] kallsyms: introduce sections needed to map symbols to built-in modules Nick Alcock
@ 2022-02-10  0:48   ` Masahiro Yamada
  0 siblings, 0 replies; 12+ messages in thread
From: Masahiro Yamada @ 2022-02-10  0:48 UTC (permalink / raw)
  To: Nick Alcock
  Cc: Luis R. Rodriguez, Jiri Olsa, Steven Rostedt, bas, tglozar,
	Ast-x64, viktor.malik, Daniel Xu, Arnaldo Carvalho de Melo,
	Adrian Hunter, Andi Kleen, Ian Rogers, Linux Kbuild mailing list,
	linux-modules, Linux Kernel Mailing List, Arnd Bergmann,
	Andrew Morton, Eugene Loh, Kris Van Hees

On Wed, Feb 9, 2022 at 3:44 AM Nick Alcock <nick.alcock@oracle.com> wrote:
>
> The mapping consists of three new symbols, computed by integrating the
> information in the (just-added) .tmp_vmlinux.ranges and
> modules_thick.builtin: taken together, they map address ranges
> (corresponding to object files on the input) to the names of zero or
> more modules containing those address ranges.
>
>  - kallsyms_module_addresses/kallsyms_module_offsets encodes the
>    address/offset of each object file (derived from the linker map), in
>    exactly the same way as kallsyms_addresses/kallsyms_offsets does
>    for symbols.  There is no size: instead, the object files are assumed
>    to tile the address space.  (This is slightly more space-efficient
>    than using a size).  Non-text-section addresses are skipped: for now,
>    all the users of this interface only need module/non-module
>    information for instruction pointer addresses, not absolute-addressed
>    symbols and the like.  This restriction can easily be lifted in
>    future.  (Regarding the name: right now the entries correspond pretty
>    closely to object files, so we could call the section
>    kallsyms_objfiles or something, but the optimizer added in the next
>    commit will change this.)
>
>  - kallsyms_module_names encodes the name of each module in a modified
>    form of strtab: notably, if an object file appears in *multiple*
>    modules, all of which are built in, this is encoded via a zero byte,
>    a one-byte module count, then a series of that many null-terminated
>    strings.  As a special case, the table starts with a single zero byte
>    which does *not* represent the start of a multi-module list.
>
>  - kallsyms_modules connects the two, encoding a table associated 1:1
>    with kallsyms_module_addresses / kallsyms_module_offsets, pointing
>    at an offset in kallsyms_module_names describing which module (or
>    modules, for a multi-module list) the code occupying this address
>    range is part of.  If an address range is part of no module (always
>    built-in) it points at 0 (the null byte at the start of the
>    kallsyms_module_names list).
>
> There is no optimization yet: kallsyms_modules and
> kallsyms_module_names will almost certainly contain many duplicate
> entries, and kallsyms_module_{addresses,offsets} may contain
> consecutive entries that point to the same place.  The size hit is
> fairly substantial as a result, though still much less than a naive
> implementation mapping each symbol to a module name would be: 50KiB or
> so.
>
> Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
> ---
>  Makefile           |   2 +-
>  init/Kconfig       |   8 +
>  scripts/Makefile   |   6 +
>  scripts/kallsyms.c | 366 +++++++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 371 insertions(+), 11 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 5e823fe8390f..b719244cb571 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1151,7 +1151,7 @@ cmd_link-vmlinux =                                                 \
>         $(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)";    \
>         $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
>
> -vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
> +vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) modules_thick.builtin FORCE
>         +$(call if_changed_dep,link-vmlinux)
>
>  targets := vmlinux
> diff --git a/init/Kconfig b/init/Kconfig
> index e9119bf54b1f..e1ca3d70cb1c 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1530,6 +1530,14 @@ config POSIX_TIMERS
>
>           If unsure say y.
>
> +config KALLMODSYMS
> +       default y
> +       bool "Enable support for /proc/kallmodsyms" if EXPERT
> +       depends on KALLSYMS
> +       help
> +         This option enables the /proc/kallmodsyms file, which maps symbols
> +         to addresses and their associated modules.
> +
>  config PRINTK
>         default y
>         bool "Enable support for printk" if EXPERT
> diff --git a/scripts/Makefile b/scripts/Makefile
> index ce5aa9030b74..c5cc4ac3d660 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -29,6 +29,12 @@ ifdef CONFIG_BUILDTIME_MCOUNT_SORT
>  HOSTCFLAGS_sorttable.o += -DMCOUNT_SORT_ENABLED
>  endif
>
> +kallsyms-objs  := kallsyms.o
> +
> +ifdef CONFIG_KALLMODSYMS
> +kallsyms-objs += modules_thick.o
> +endif
> +
>  # The following programs are only built on demand
>  hostprogs += unifdef
>
> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
> index 54ad86d13784..8f87b724d0fa 100644
> --- a/scripts/kallsyms.c
> +++ b/scripts/kallsyms.c
> @@ -5,7 +5,10 @@
>   * This software may be used and distributed according to the terms
>   * of the GNU General Public License, incorporated herein by reference.
>   *
> - * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
> + * Usage: nm -n vmlinux
> + *        | scripts/kallsyms [--all-symbols] [--absolute-percpu]
> + *             [--base-relative] [--builtin=modules_thick.builtin]
> + *        > symbols.S
>   *
>   *      Table compression uses all the unused char codes on the symbols and
>   *  maps these to the most used substrings (tokens). For instance, it might
> @@ -24,6 +27,10 @@
>  #include <string.h>
>  #include <ctype.h>
>  #include <limits.h>
> +#include <assert.h>
> +#include "modules_thick.h"
> +
> +#include "../include/generated/autoconf.h"



I do not remember if I had pointed this out before,
but including autoconf.h from a host program is wrong.

Do not use ifdef CONFIG_...  in the hostprog code.
Having --builtin=modules_thick.builtin is enough.





-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v8 2/6] kbuild: add modules_thick.builtin
  2022-02-10  0:35   ` Masahiro Yamada
@ 2022-02-10 12:55     ` Nick Alcock
  0 siblings, 0 replies; 12+ messages in thread
From: Nick Alcock @ 2022-02-10 12:55 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Luis R. Rodriguez, Jiri Olsa, Steven Rostedt, bas, tglozar,
	Ast-x64, viktor.malik, Daniel Xu, Arnaldo Carvalho de Melo,
	Adrian Hunter, Andi Kleen, Ian Rogers, Linux Kbuild mailing list,
	linux-modules, Linux Kernel Mailing List, Arnd Bergmann,
	Andrew Morton, Eugene Loh, Kris Van Hees

On 10 Feb 2022, Masahiro Yamada told this:

> On Wed, Feb 9, 2022 at 3:44 AM Nick Alcock <nick.alcock@oracle.com> wrote:
>>
>> I am not wedded to the approach used to construct this file, but I don't
>> see any other way to do it despite spending a week or so trying to tie
>> it into Kbuild without using a separate Makefile.modbuiltin: unlike the
>> names of builtin modules (which are also recorded in the source files
>> themseves via MODULE_*() macros) the mapping from object file name to
>> built-in module name is not recorded anywhere but in the makefiles
>> themselves, so we have to at least reparse them with something to
>> indicate the builtin-ness of each module (i.e., tristate.conf) if we are
>> to figure out which modules are built-in and which are not.
>>
>> Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
>> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
>
> modules.builtin was initially implemented in a terrible way,
> hence I cleaned up the code and removed the double recursion
> of the source tree.

That's why I said I was not wedded to this approach, and I'd be happy
to use another one.

I tried to reimplement it using a single recursion, but it seems to be
impossible or at least very difficult, since you have to somehow
interpret each target in the tree in two ways ("build this" versus
"figure out whether this is built-in or not): if you can think up a
trick, I'll give it a try. I tried to make the core Kbuild makefiles
interpret obj-Y suitably, which might mean you could build the whole
tree with tristate.conf in force as it is in Makefile.modbuiltin, but
got lost in the tangle: and in any case doing that seemed far more
invasive and far more likely to be rejected than just bringing back
Makefile.modbuiltin.

But for now, to me, this seems tolerable. The time costs of building
modules_thick.builtin are so minimal I cannot determine them in a normal
parallel build, well down in the noise, unless your machine is so short
of memory that it can't even cache the directory tree walk (and with the
cache hot, even with -j 1 building it takes only half a second). The
maintenance costs appear to be basically zero as well: one file, not
tightly coupled to the rest (the only coupling it has is the same
coupling as every makefile in the build tree). Also one file that was
present for decades already :)

So... honestly, I don't see your objection. What's terrible about
scripts/Makefile.modbuiltin? If it's so terrible, do you have an
alternative way to do what it does in this patch series?

> Honestly, I do not want to see you bringing back
> all the bloat.

It's come back because I find it necessary for the series which follows,
which surely makes it not bloat. If you can think of another approach
that can similarly figure out which modules are built in and which are
not, I'd be happy to use it, but I can't think of one.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules
  2022-02-08 18:43 [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Nick Alcock
                   ` (5 preceding siblings ...)
  2022-02-08 18:43 ` [PATCH v8 6/6] kallsyms: add /proc/kallmodsyms Nick Alcock
@ 2022-02-11 22:36 ` Jiri Olsa
  2022-02-14 15:40   ` Nick Alcock
  6 siblings, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2022-02-11 22:36 UTC (permalink / raw)
  To: Nick Alcock
  Cc: mcgrof, masahiroy, jolsa, rostedt, bas, tglozar, Ast-x64,
	viktor.malik, dxu, acme, adrian.hunter, ak, irogers,
	linux-kbuild, linux-modules, linux-kernel, arnd, akpm,
	eugene.loh, kris.van.hees

On Tue, Feb 08, 2022 at 06:43:03PM +0000, Nick Alcock wrote:
> The kallmodsyms patch series was originally posted in Nov 2019, and the thread
> (https://lore.kernel.org/linux-kbuild/20191114223036.9359-1-eugene.loh@oracle.com/t/#u)
> shows review comments, questions, and feedback from interested parties.
> 
> All review comments have been satisfied, as far as I know: in particular
> Yamada's note about translation units that are shared between built-in modules
> is satisfied with a better representation which is also much, much smaller.
> 
> A kernel tree containing this series alone, atop -rc3:
>    https://github.com/oracle/dtrace-linux-kernel kallmodsyms/5.17-rc3
> 
> Trees for trying this out, if you want to try this series in conjunction
> with its major current user:
> 
>  userspace tree for the dtrace tool itself:
>    https://github.com/oracle/dtrace-utils.git, dev branch
>  kernel tree comprising this series and a few other patches needed by
>  dtrace:
>    https://github.com/oracle/dtrace-linux-kernel, v2/5.17-rc2 branch
> 
> (See the README.md in the latter for dtrace build instructions.  Note the need for a
> reasonably recent binutils, a trunk GCC, and a cross-bpf toolchain.)
> 
> 
> /proc/kallsyms is very useful for tracers and other tools that need to
> map kernel symbols to addresses.
> 
> It would be useful if there were a mapping between kernel symbol and module
> name that only changed when the kernel source code is changed.  This mapping
> should not change simply because a module becomes built into the kernel, so
> that it's not broken by changes in user configuration.  (DTrace for Linux
> already uses the approach in this patch for this purpose.)
> 
> In brief we do this by mapping from address ranges to object files (with
> assistance from the linker map file), then mapping from object files to
> potential kernel modules. Because the number of object files is much smaller
> than the number of symbols, this is a fairly efficient representation, even with
> a bit of extra complexity to allow object files to be in more than one module at
> once.
> 
> The size impact of all of this is minimal: in one of my tests, vmlinux grew by
> 0.17% (10824 bytes), and the compressed vmlinux only grew by 0.08% (7552 bytes):
> though this is very configuration-dependent, it seems likely to scale roughly
> with the kernel as a whole.
> 
> This is all controlled by a new config parameter CONFIG_KALLMODSYMS, which when
> set results in output in /proc/kallmodsyms that looks like this:
> 
> ffffffff8b013d20 409 t pt_buffer_setup_aux
> ffffffff8b014130 11f T intel_pt_interrupt
> ffffffff8b014250 2d T cpu_emergency_stop_pt
> ffffffff8b014280 13a t rapl_pmu_event_init      [intel_rapl_perf]
> ffffffff8b0143c0 bb t rapl_event_update [intel_rapl_perf]
> ffffffff8b014480 10 t rapl_pmu_event_read       [intel_rapl_perf]
> ffffffff8b014490 a3 t rapl_cpu_offline  [intel_rapl_perf]
> ffffffff8b014540 24 t __rapl_event_show [intel_rapl_perf]
> ffffffff8b014570 f2 t rapl_pmu_event_stop       [intel_rapl_perf]

hi,
I tried this version and can't see the symbols size

[root@qemu jolsa]# cat /proc/kallmodsyms | grep ksys_ | head -5
ffffffff81094720 T ksys_ioperm
ffffffff81141110 T ksys_unshare
ffffffff81160410 T ksys_setsid
ffffffff811c64b0 T ksys_sync_helper
ffffffff813213c0 T ksys_fadvise64_64

I have CONFIG_KALLMODSYMS=y, but I haven't checked if I need
anything else

jirka

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules
  2022-02-11 22:36 ` [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Jiri Olsa
@ 2022-02-14 15:40   ` Nick Alcock
  0 siblings, 0 replies; 12+ messages in thread
From: Nick Alcock @ 2022-02-14 15:40 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: mcgrof, masahiroy, jolsa, rostedt, bas, tglozar, Ast-x64,
	viktor.malik, dxu, acme, adrian.hunter, ak, irogers,
	linux-kbuild, linux-modules, linux-kernel, arnd, akpm,
	eugene.loh, kris.van.hees

On 11 Feb 2022, Jiri Olsa verbalised:
> On Tue, Feb 08, 2022 at 06:43:03PM +0000, Nick Alcock wrote:
>> This is all controlled by a new config parameter CONFIG_KALLMODSYMS, which when
>> set results in output in /proc/kallmodsyms that looks like this:
>> 
>> ffffffff8b013d20 409 t pt_buffer_setup_aux
>> ffffffff8b014130 11f T intel_pt_interrupt
>> ffffffff8b014250 2d T cpu_emergency_stop_pt
>> ffffffff8b014280 13a t rapl_pmu_event_init      [intel_rapl_perf]
>> ffffffff8b0143c0 bb t rapl_event_update [intel_rapl_perf]
>> ffffffff8b014480 10 t rapl_pmu_event_read       [intel_rapl_perf]
>> ffffffff8b014490 a3 t rapl_cpu_offline  [intel_rapl_perf]
>> ffffffff8b014540 24 t __rapl_event_show [intel_rapl_perf]
>> ffffffff8b014570 f2 t rapl_pmu_event_stop       [intel_rapl_perf]
>
> hi,
> I tried this version and can't see the symbols size
>
> [root@qemu jolsa]# cat /proc/kallmodsyms | grep ksys_ | head -5
> ffffffff81094720 T ksys_ioperm
> ffffffff81141110 T ksys_unshare
> ffffffff81160410 T ksys_setsid
> ffffffff811c64b0 T ksys_sync_helper
> ffffffff813213c0 T ksys_fadvise64_64

UGH, sorry, I should have regenerated the output in the cover letter!
The cover letter is buggy :)

This is entirely expected because I dropped the symbol size patch
(because it's formally unnecessary because you can do it from userspace
by examination of vmlinux or the .ko files, and the symbol size
representation is big, adding hundreds of KiB to the kernel image).
And then I failed to regenerate the output to show this :/

In this series, /proc/kallmodsyms now looks identical in format to
/proc/kallsyms, except that you can have [multiple] [modules] on a line
and the meaning of a [module] entry is different. (This may be a small
enough change in semantics that merging the two is possible, but I doubt
it -- existing users will surely expect that a [module] entry means that
module.ko exists, which with /proc/kallmodsyms is not always true.)

-- 
NULL && (void)

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-02-14 15:41 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-08 18:43 [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Nick Alcock
2022-02-08 18:43 ` [PATCH v8 1/6] kbuild: bring back tristate.conf Nick Alcock
2022-02-08 18:43 ` [PATCH v8 2/6] kbuild: add modules_thick.builtin Nick Alcock
2022-02-10  0:35   ` Masahiro Yamada
2022-02-10 12:55     ` Nick Alcock
2022-02-08 18:43 ` [PATCH v8 3/6] kbuild: generate an address ranges map at vmlinux link time Nick Alcock
2022-02-08 18:43 ` [PATCH v8 4/6] kallsyms: introduce sections needed to map symbols to built-in modules Nick Alcock
2022-02-10  0:48   ` Masahiro Yamada
2022-02-08 18:43 ` [PATCH v8 5/6] kallsyms: optimize .kallsyms_modules* Nick Alcock
2022-02-08 18:43 ` [PATCH v8 6/6] kallsyms: add /proc/kallmodsyms Nick Alcock
2022-02-11 22:36 ` [PATCH v8] kallsyms: new /proc/kallmodsyms with builtin modules Jiri Olsa
2022-02-14 15:40   ` Nick Alcock

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).